Feature Selection in Supervised and Unsupervised Learning via Evolutionary Search

Abstract

Feature subset selection is an important problem in knowledge discovery, not only for the insight gained from determining relevant modeling variables but also for the improved understandability, scalability, and, possibly, accuracy of the resulting models. The purpose of this study is to provide a comprehensive analysis of feature selection via evolutionary search in supervised and unsupervised learning. To this end, we first discuss a general framework for feature selection based on a new search algorithm, the Evolutionary Local Selection Algorithm (ELSA). ELSA searches for a good set of feature subsets in a multi-dimensional objective space and can be combined with any supervised or unsupervised learning algorithm. In supervised learning, we train an induction algorithm to maximize classification accuracy on unseen data while minimizing the size of the feature subset. Using Artificial Neural Networks (ANNs) as the induction algorithm, we apply an ELSA/ANN model to a real business problem, customer targeting, and address issues related to knowledge discovery. Focusing on feature selection as a means of creating diversity in a classification ensemble, we present a new two-level evolutionary algorithm, Meta-Evolutionary Ensemble (MEE), to create an optimal ensemble. In this algorithm, the ensembles compete with one another and are judged on their estimated predictive performance. In addition, the underlying classifiers compete with each other and are rewarded for correctly predicting the training examples. In this way we aim to optimize ensembles rather than form ensembles of individually optimized classifiers. However, supervised learning often cannot be applied because no training signal is available. For these cases, we propose a new feature selection approach based on clustering and discuss its significance for real-world problems in marketing and finance. Further, we show that our new approach can be used to form a given number of clusters based on the selected features.