Publication:
Feature selection for high-dimensional data using a multivariate search space reduction strategy based scatter search


Publisher

Springer Nature

Abstract

In feature selection, increasing dimensionality and the complexity of feature interactions make the problem challenging. Furthermore, searching for an optimal subset of features in a high-dimensional feature space is known to be an NP-hard problem. To improve the efficiency and effectiveness of the search algorithm, feature grouping has emerged as a way to reduce the search space by clustering features according to a given measure. In this work we propose to reduce the search space by applying a greedy algorithm called Multivariate Greedy Predominant Groups Generator (MGPGG). MGPGG extends the Greedy Predominant Groups Generator (GPGG) algorithm by taking into account interactions among three or more features. For this purpose, MGPGG uses the Multivariate Symmetrical Uncertainty (MSU) to group features that share information about the class label. We also propose a Scatter Search strategy that integrates MGPGG to find small subsets of features with high predictive power. The proposed algorithm, called Multivariate Predominant Group-based Scatter Search (MPGSS), is tested on high-dimensional data from the biomedical and text-mining fields and compared with state-of-the-art feature selection strategies. Results show that MPGSS is competitive: it finds small subsets of features while preserving the predictive power of the resulting classification models.
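
The abstract does not define MSU, so the following is only an illustrative sketch. It assumes one formulation found in the feature selection literature, MSU(X_1, ..., X_n) = n/(n-1) * (1 - H(X_1, ..., X_n) / sum_i H(X_i)), which reduces to the usual symmetrical uncertainty when n = 2. The function names `entropy` and `msu` are hypothetical and are not taken from the paper; the code is not the authors' implementation of MGPGG.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def msu(*columns):
    """Multivariate symmetrical uncertainty under the assumed formulation.

    Returns 0 when the variables are jointly independent and 1 when each
    variable fully determines the others.
    """
    k = len(columns)
    if k < 2:
        raise ValueError("msu needs at least two variables")
    marginal_sum = sum(entropy(col) for col in columns)
    if marginal_sum == 0:
        # All variables are constant, so there is no information to share.
        return 0.0
    return (k / (k - 1)) * (1 - entropy(*columns) / marginal_sum)


# Toy data: the label is the XOR of two binary features.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

pairwise = msu(f1, label)       # each feature alone looks irrelevant
three_way = msu(f1, f2, label)  # the interaction appears only jointly
```

On this toy data each feature on its own shares no information with the label, while the three variables taken together do. This is the kind of interaction among three or more features that a purely pairwise measure cannot detect.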

Research projects

info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-117954RB-C21/ES/APRENDIZAJE PROFUNDO Y APRENDIZAJE ONLINE EXPLICABLES PARA SOSTENIBILIDAD/
PY20-00870
UPO-138516

Bibliographic reference

Garcia-Torres, M. (2025). Feature selection for high-dimensional data using a multivariate search space reduction strategy based scatter search. Journal of Heuristics, 31(10), 1-33.
