Publication:
Feature selection for high-dimensional data using a multivariate search space reduction strategy based scatter search

dc.contributor.authorGarcía Torres, Miguel
dc.date.accessioned2025-01-28T09:39:54Z
dc.date.available2025-01-28T09:39:54Z
dc.date.issued2025-01-25
dc.description.abstractIn feature selection, increasing dimensionality and the complexity of feature interactions make the problem challenging. Furthermore, searching for an optimal subset of features in a high-dimensional feature space is known to be an NP-hard problem. To improve the efficiency and effectiveness of the search algorithm, feature grouping has emerged as a way to reduce the search space by clustering features according to a measure. In this work, we propose to reduce the search space by applying a greedy algorithm called Multivariate Greedy Predominant Groups Generator (MGPGG). MGPGG extends the idea of the Greedy Predominant Groups Generator (GPGG) algorithm by taking into account interactions among three or more features. For this purpose, MGPGG uses the Multivariate Symmetrical Uncertainty (MSU) to group features that share information about the class label. We also propose a Scatter Search strategy that integrates MGPGG to find small subsets of features with high predictive power. The proposed algorithm, called Multivariate Predominant Group-based Scatter Search (MPGSS), is tested on high-dimensional data from the biomedical and text-mining fields and compared with state-of-the-art feature selection strategies. Results show that MPGSS is competitive: it finds small subsets of features while maintaining classification models with high predictive power.
dc.description.sponsorshipData Science and Big Data Lab,
dc.format.mimetypeapplication/pdf
dc.identifier.citationGarcia-Torres, M. (2025). Feature selection for high-dimensional data using a multivariate search space reduction strategy based scatter search. Journal of Heuristics, 31(10), 1-33.
dc.identifier.doi10.1007/s10732-025-09550-9
dc.identifier.urihttps://hdl.handle.net/10433/22727
dc.language.isoen
dc.publisherSpringer Nature
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-117954RB-C21/ES/APRENDIZAJE PROFUNDO Y APRENDIZAJE ONLINE EXPLICABLES PARA SOSTENIBILIDAD/
dc.relation.projectIDPY20-00870
dc.relation.projectIDUPO-138516
dc.rightsAttribution-NonCommercial 4.0 International
dc.rights.accessRightsopen access
dc.rights.urihttp://creativecommons.org/licenses/by-nc/4.0/
dc.subjectFeature selection
dc.subjectHigh-dimensional data
dc.subjectScatter search
dc.subjectFeature grouping
dc.subjectSearch space reduction
dc.subjectMultivariate symmetrical uncertainty
dc.titleFeature selection for high-dimensional data using a multivariate search space reduction strategy based scatter search
dc.typejournal article
dc.type.hasVersionVoR
dspace.entity.typePublication
relation.isAuthorOfPublication4ce19614-9553-49b0-9b6e-09817f551658
relation.isAuthorOfPublication.latestForDiscovery4ce19614-9553-49b0-9b6e-09817f551658
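
Illustrative note on the abstract: the record does not state how the Multivariate Symmetrical Uncertainty (MSU) is computed. The sketch below assumes one formulation found in the literature for discrete variables, MSU(X_1..X_n) = n/(n-1) * (1 - H(X_1..X_n) / sum_i H(X_i)), which reduces to the usual symmetrical uncertainty when n = 2. The function names `entropy` and `msu` are hypothetical, and this is not the article's MGPGG implementation; it only shows why a measure over three or more variables can detect interactions that pairwise measures miss.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def msu(*columns):
    """Assumed MSU formulation: n/(n-1) * (1 - H(joint) / sum of marginal entropies)."""
    n = len(columns)
    if n < 2:
        raise ValueError("MSU needs at least two variables")
    total = sum(entropy(col) for col in columns)
    if total == 0:
        # All variables are constant, so there is no shared information to measure.
        return 0.0
    return n / (n - 1) * (1 - entropy(*columns) / total)


# Toy XOR data: neither feature alone tells anything about the label,
# but the two features together determine it completely.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [0, 1, 1, 0]

pairwise = msu(f1, label)    # two-variable case (plain symmetrical uncertainty)
triple = msu(f1, f2, label)  # three-variable case
print(pairwise, triple)
```

On this toy data the pairwise score is zero while the three-variable score is roughly 0.5, which matches the abstract's motivation for grouping features by interactions among three or more variables.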

Files

Original bundle

Name:
s10732-025-09550-9-1.pdf
Size:
737.25 KB
Format:
Adobe Portable Document Format