A Fast and Scalable Multiobjective Genetic Fuzzy System for Linguistic Fuzzy Modeling in High-Dimensional Regression Problems

Abstract

Linguistic fuzzy modeling in high-dimensional regression problems poses the challenge of an exponential rule explosion when the number of variables and/or instances becomes high. One way to address this problem is to determine the variables used, the linguistic partitioning, and the rule set together, so that only very simple but still accurate models are evolved. However, evolving these components together is a difficult task that involves a complex search space. In this study, we propose an effective multiobjective evolutionary algorithm that, based on embedded genetic database (DB) learning (involved variables, granularities, and slight fuzzy-partition displacements), allows the fast learning of simple and quite accurate linguistic models. Some efficient mechanisms have been designed to ensure a very fast, but not premature, convergence in problems with a high number of variables. Further, since additional problems could arise for datasets with a large number of instances, we also propose a general mechanism for estimating the model error in evolutionary algorithms by considering only a reduced subset of the examples. By doing so, we can also apply a fast postprocessing stage to further refine the learned solutions. We tested our approach on 17 real-world datasets with different numbers of variables and instances. Three well-known methods based on embedded genetic DB learning were executed as references. We compared the different approaches by applying nonparametric statistical tests for multiple comparisons. The results confirm the effectiveness of the proposed method not only in terms of scalability but also in terms of the simplicity and generalizability of the obtained models.
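
The idea of estimating a model's error from a reduced subset of the examples can be illustrated with a minimal Python sketch. It is not the mechanism proposed in this work: the abstract does not say how the subset is chosen, so uniform random sampling is assumed here, and the names `mse`, `estimate_mse`, and `candidate`, as well as the synthetic data, are hypothetical.

```python
import random


def mse(predict, examples):
    """Mean squared error of `predict` over (input, target) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in examples) / len(examples)


def estimate_mse(predict, examples, subset_size, rng):
    """Estimate the error from a uniform random subset of the examples.

    Assumption: uniform sampling without replacement. Falls back to the
    full error when the requested subset covers the whole dataset.
    """
    if subset_size >= len(examples):
        return mse(predict, examples)
    return mse(predict, rng.sample(examples, subset_size))


# Synthetic regression data (illustrative only): y = 2x + Gaussian noise.
rng = random.Random(0)
examples = []
for _ in range(10000):
    x = rng.uniform(0.0, 1.0)
    examples.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))


def candidate(x):
    """Stand-in for one candidate model evaluated during the search."""
    return 2.0 * x


full_error = mse(candidate, examples)                       # uses 10000 examples
approx_error = estimate_mse(candidate, examples, 500, rng)  # uses 500 examples
print(f"full: {full_error:.5f}  subset estimate: {approx_error:.5f}")
```

In an evolutionary search the fitness function is evaluated once per candidate per generation, so evaluating on 500 rather than 10000 examples cuts the cost of each evaluation by roughly a factor of 20, at the price of some sampling noise in the estimate.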