Due to a limited control over changing operational conditions and personal physiology, systems used for video-based face recognition are confronted with complex and changing pattern recognition environments. Although a limited amount of reference data is initially available during enrollment, new samples often become available over time, through re-enrollment, post analysis and labeling of operational data, etc. Adaptive multi-classifier systems (AMCSs) are therefore desirable for the design and incremental update of facial models. For real time recognition of individuals appearing in video sequences, facial regions are captured with one or more cameras, and an AMCS must perform fast and efficient matching against the facial model of individual enrolled to the system. In this paper, an incremental learning strategy based on particle swarm optimization (PSO) is proposed to efficiently evolve heterogeneous classifier ensembles in response to new reference data. This strategy is applied to an AMCS where all parameters of a pool of fuzzy ARTMAP (FAM) neural network classifiers (i.e., a swarm of classifiers), each one corresponding to a particle, are co-optimized such that both error rate and network size are minimized. To provide a high level of accuracy over time while minimizing the computational complexity, the AMCS integrates information from multiple diverse classifiers, where learning is guided by an aggregated dynamical niching PSO (ADNPSO) algorithm that optimizes networks according both these objectives. Moreover, pools of FAM networks are evolved to maintain (1) genotype diversity of solutions around local optima in the optimization search space and (2) phenotype diversity in the objective space. Accurate and low cost ensembles are thereby designed by selecting classifiers on the basis of accuracy, and both genotype and phenotype diversity. For proof-of-concept validation, the proposed strategy is compared to AMCSs where incremental learning of FAM networks is guided through mono-and multi-objective optimization. Performance is assessed in terms of video-based error rate and resource requirements under different incremental learning scenarios, where new data is extracted from real-world video streams (IIT-NRC and MoBo). Simulation results indicate that the proposed strategy provides a level of accuracy that is comparable to that of using mono-objective optimization and reference face recognition systems, yet requires a fraction of the computational cost (between 16% and 20% of a mono-objective strategy depending on the data base and scenario). (C) 2012 Elsevier B. V. All rights reserved.