Surrogate-Assisted Evolutionary Multi-Objective Full Model Selection

Abstract

Classification has become a popular task in pattern recognition, perhaps because it arises in many problems, such as text categorization and handwriting recognition. This popularity has produced a large number of methods. Some of these methods, called pre-processing methods, aim at preparing the data to be used, and others, called learning algorithms, aim at learning a model that maps the input data to a category. Additionally, most of them have a set of adjustable parameters, called hyper-parameters, that directly impact the performance of the learned models. Hence, when a classification model is constructed, one has to choose among the set of methods and configure the corresponding hyper-parameters, which results in a decision with a high number of degrees of freedom. This can be an obstacle for non-expert machine learning users. This thesis deals with the problem of full model selection, which is defined as the problem of finding the combination of pre-processing methods and learning algorithms, together with their hyper-parameters, that best fits a dataset. Traditionally, full model selection has been approached as a single-objective optimization problem that considers at most two main types of pre-processing. Here, we address it in a broader sense by considering four types of pre-processing and by formulating it as a multi-objective optimization problem. The multi-objective formulation allows two or more criteria to be taken into account in the optimization stage, in search of solutions with a good trade-off among them. We use two criteria widely adopted in machine learning: the error rate and the model complexity. Evolutionary algorithms have gained popularity for dealing with multi-objective optimization problems, and in recent years they have been successfully applied to different supervised machine learning problems.
However, a drawback is that they require a relatively high number of objective function evaluations to obtain a reasonably good approximation of the optimal solutions. In this thesis we explore the use of surrogate-assisted optimization to deal with the problem of multi-objective full model selection. We propose three methods for handling the problem of model selection for non-expert users. The first method, called MOTMS (Multi-Objective Model Type Selection), considers different model types and uses VC dimension theory to estimate the model complexity in a general and straightforward fashion; its results show that the VC dimension is effective for this purpose. The second approach, called EN-MOMS-PbE (Evolutionary Nested Multi-Objective full Model Selection with Pareto-based Ensembles), formulates the full model selection problem as a nested multi-objective optimization problem. Thus, the optimization component has to deal with two optimization levels: in the upper level the model structure is optimized, while in the lower level the hyper-parameters for a given model structure are optimized. This second approach also proposes seven strategies for dealing with the trade-off solutions. An experimental comparison among them shows that a solution based on an evolutionary ensemble performs better than the others. EN-MOMS-PbE shows competitive performance when compared to state-of-the-art model selection methods. Finally, in order to improve the efficiency of our proposal, the third approach explores the idea of using surrogate-assisted optimization to reduce the number of fitness evaluations required by the evolutionary algorithm. The surrogates are incorporated in the lower optimization level of EN-MOMS-PbE, because most of the evaluations are performed at this stage.
This third approach is called SEN-MOMS-PbE (Surrogate Evolutionary Nested Multi-Objective full Model Selection with Pareto-based Ensembles). The experimental evaluation shows that SEN-MOMS-PbE significantly reduces the number of evaluations while preserving almost the same performance as EN-MOMS-PbE.
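
As an illustration of what a "good trade-off" between error rate and model complexity means, the following minimal Python sketch filters a set of candidate models down to its non-dominated (Pareto-optimal) subset, assuming both objectives are minimized. The function names and the candidate values are hypothetical and chosen only for illustration; this is not the implementation used in the thesis.

```python
from typing import List, Tuple

# Each candidate model is summarized as (error_rate, complexity).
# Both objectives are assumed to be minimized (illustrative assumption).
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates, not results from the thesis.
candidates = [(0.10, 50.0), (0.12, 20.0), (0.10, 60.0), (0.30, 10.0), (0.25, 25.0)]
front = pareto_front(candidates)
print(front)
```

In this toy example, (0.10, 60.0) is discarded because (0.10, 50.0) has the same error with lower complexity, and (0.25, 25.0) is discarded because (0.12, 20.0) is better in both objectives. The remaining points are the trade-off solutions among which a selection strategy would have to choose.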