Genetic Training Instance Selection in Multiobjective Evolutionary Fuzzy Systems: A Coevolutionary Approach

Abstract

When dealing with datasets that are characterized by a large number of instances, multiobjective evolutionary learning (MOEL) of fuzzy rule-based systems (FRBSs) suffers from high computational costs, mainly because of the fitness evaluation. The use of a reduced set of representative instances in place of the overall training set (TS) would considerably lessen the computational effort. Even though a large number of papers have proposed instance selection approaches, mainly for classification problems, how this selection should be performed, especially in the context of regression, is still an open issue. In this paper, we tackle the instance selection problem in the framework of MOEL of FRBSs through a coevolutionary approach. Periodically during the execution of the MOEL, a single-objective genetic algorithm (SOGA) evolves a population of reduced TSs. The SOGA aims to maximize a purposely defined index which measures how close the Pareto fronts computed by using, respectively, the reduced TS and the overall TS are to each other: the closer the fronts, the more representative the reduced TS is of the overall TS. During the execution of the MOEL, the rule base and the membership function parameters of the fuzzy sets are concurrently learned by maximizing the accuracy and minimizing the complexity. We tested our approach on 12 large datasets. We adopted reduced TSs composed of 5%, 10%, and 20% of the overall TS. Using nonparametric statistical tests, we verified that with 10% and 20% of the overall TS, the Pareto front approximations that are generated by our coevolutionary approach are comparable with the ones generated by applying the MOEL with the overall TS, while the coevolution allows us to save up to 86.36% of the execution time. In addition, the analysis of the behavior of three representative solutions on the test set highlights that the use of the reduced TSs does not affect the generalization capabilities of the generated FRBSs.
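The instance selection step described above can be illustrated with a minimal sketch. The abstract does not define the index itself, so everything below is an assumption made for illustration: `closeness_index` is a hypothetical stand-in that compares the error of each solution on the reduced TS with its error on the overall TS, the "solutions" are simple linear models rather than FRBSs, and `select_reduced_ts` is a toy mutation-only genetic search with elitism, not the SOGA used in the paper.

```python
import random

def mse(model, data):
    """Mean squared error of a model (a callable x -> y) on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def closeness_index(models, full_ts, reduced_ts):
    """Hypothetical index in (0, 1] (an assumption, not the paper's index).

    It equals 1 when every model has the same error on the reduced TS
    as on the overall TS, and decreases as the errors drift apart.
    """
    gaps = [abs(mse(m, full_ts) - mse(m, reduced_ts)) for m in models]
    return 1.0 / (1.0 + sum(gaps) / len(gaps))

def select_reduced_ts(models, full_ts, fraction, generations=30, pop_size=10, seed=0):
    """Toy single-objective search over fixed-size subsets of instance indices."""
    rng = random.Random(seed)
    n = len(full_ts)
    k = max(1, int(fraction * n))  # size of the reduced TS

    def fitness(subset):
        return closeness_index(models, full_ts, [full_ts[i] for i in sorted(subset)])

    def mutate(subset):
        # Swap one selected instance for one that is currently unselected.
        child = set(subset)
        child.remove(rng.choice(sorted(child)))
        child.add(rng.choice([i for i in range(n) if i not in child]))
        return frozenset(child)

    population = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: keep the better half, refill with mutated copies of survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        offspring = [mutate(rng.choice(survivors)) for _ in range(pop_size - len(survivors))]
        population = survivors + offspring

    best = max(population, key=fitness)
    return sorted(best), fitness(best)

# Synthetic regression data and three stand-in "solutions" (assumptions).
data_rng = random.Random(1)
full_ts = [(x / 10.0, 2.0 * (x / 10.0) + data_rng.gauss(0.0, 0.5)) for x in range(200)]
models = [lambda x, a=a: a * x for a in (1.5, 2.0, 2.5)]

subset, score = select_reduced_ts(models, full_ts, fraction=0.10)
print(len(subset), round(score, 3))
```

In this sketch the models stay fixed while the subsets evolve. In a coevolutionary setting the set of solutions would itself be updated by the MOEL between runs of the subset search.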