Publication:
A multivariate approach to the symmetrical uncertainty measure: Application to feature selection problem

dc.contributor.author: Sosa Cabrera, Gustavo
dc.contributor.author: García Torres, Miguel
dc.contributor.author: Gómez Guerrero, Santiago
dc.contributor.author: E. Schaerer, Christian
dc.contributor.author: Divina, Federico
dc.date.accessioned: 2024-02-01T09:31:48Z
dc.date.available: 2024-02-01T09:31:48Z
dc.date.issued: 2019
dc.description.abstract: In this work we propose an extension of the Symmetrical Uncertainty (SU) measure in order to address the multivariate case, simultaneously acquiring the capability to detect possible correlations and interactions among features. This generalization, denoted Multivariate Symmetrical Uncertainty (MSU), is based on the concepts of Total Correlation (TC) and Mutual Information (MI) extended to the multivariate case. The generalized measure accounts for the total amount of dependency within a set of variables as a single monolithic quantity. Multivariate measures are usually biased due to several factors. To overcome this problem, a mathematical expression is proposed, based on the cardinality of all features, which can be used to calculate the number of samples needed to estimate the MSU without bias at a pre-specified significance level. Theoretical and experimental results on synthetic data show that the proposed sample size expression properly controls the bias. In addition, when the MSU is applied to feature selection on synthetic and real-world data, it has the advantage of adequately capturing linear and nonlinear correlations and interactions, and it can therefore be used as a new feature subset evaluation method.
dc.description.sponsorship: Deporte e Informática
dc.format.mimetype: application/pdf
dc.identifier.citation: Information Sciences, vol. 494, p. 1-20
dc.identifier.doi: 10.1016/j.ins.2019.04.046
dc.identifier.uri: https://hdl.handle.net/10433/19571
dc.language.iso: en
dc.publisher: Elsevier
dc.relation.projectID: info:eu-repo/grantAgreement/MINECO//TIN2015-64776-C3-2-R/ES/DIFFERENTIAL@UPO: MASSIVE DATA MANAGEMENT, FILTERING AND EXPLORATORY ANALYSIS/
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Multivariate symmetrical uncertainty
dc.subject: Mutual information
dc.subject: Entropy
dc.subject: Feature selection
dc.title: A multivariate approach to the symmetrical uncertainty measure: Application to feature selection problem
dc.type: journal article
dc.type.hasVersion: AM
dspace.entity.type: Publication
relation.isAuthorOfPublication: 4ce19614-9553-49b0-9b6e-09817f551658
relation.isAuthorOfPublication: 82e2c456-c4b8-494e-b3d9-f6c84c8cf9a5
relation.isAuthorOfPublication.latestForDiscovery: 4ce19614-9553-49b0-9b6e-09817f551658
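
Illustrative sketch (not part of the repository record): the abstract describes MSU as a normalized measure built on Total Correlation, but this record does not state the formula. The Python below assumes the form MSU = n/(n-1) * TC / (sum of marginal entropies), with TC = (sum of marginal entropies) - (joint entropy), which reduces to the standard SU for n = 2. The function names `entropy` and `msu` are hypothetical, the entropies are simple plug-in estimates, and the sample-size correction mentioned in the abstract is not modeled.

```python
from collections import Counter
from math import log2


def entropy(outcomes):
    """Plug-in (empirical) Shannon entropy, in bits, of a list of hashable outcomes."""
    n = len(outcomes)
    counts = Counter(outcomes)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def msu(columns):
    """Assumed MSU form for a list of equal-length discrete columns.

    MSU = n/(n-1) * TC / sum_i H(X_i), where
    TC  = sum_i H(X_i) - H(X_1, ..., X_n)   (total correlation).
    """
    n = len(columns)
    if n < 2:
        raise ValueError("MSU needs at least two variables")
    marginal_sum = sum(entropy(list(col)) for col in columns)
    if marginal_sum == 0:
        # All variables are constant: no uncertainty to share.
        return 0.0
    joint = entropy(list(zip(*columns)))  # each row is one joint outcome
    total_correlation = marginal_sum - joint
    return (n / (n - 1)) * total_correlation / marginal_sum


# XOR: y depends on x1 and x2 jointly, but on neither one alone.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]

pairwise = msu([x1, y])        # bivariate case, i.e. ordinary SU
three_way = msu([x1, x2, y])   # multivariate case
```

On this toy XOR data the pairwise value is zero while the three-variable value is positive, which is the kind of feature interaction the abstract says a bivariate measure would miss.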

Files

Original bundle

Name: 03-IS.pdf
Size: 312.26 KB
Format: Adobe Portable Document Format