Volume 15, Number 1
Imbalanced Dataset Effect on CNN-Based Classifier Performance for Face Recognition
Authors
Miftah Asharaf Najeeb and Alhaam Alariyibi, University of Benghazi, Libya
Abstract
Face Recognition is integral to numerous modern applications, such as security systems, social media platforms, and augmented reality apps. The success of these systems depends heavily on the performance of the Face Recognition models they use, specifically Convolutional Neural Networks (CNNs). However, many real-world classification tasks involve imbalanced datasets, in which some classes are significantly underrepresented. Face Recognition models that do not address this class imbalance tend to perform poorly, especially in multi-class problems where many different faces must be identified. This research examines how class imbalance in datasets affects the development of neural network classifiers for Face Recognition. We first built a CNN model for Face Recognition and integrated hybrid resampling methods (oversampling and undersampling) to address dataset imbalance. In addition, augmentation techniques were applied to improve generalization and overall performance. Through comprehensive experimentation, we assess the influence of imbalanced datasets on the performance of the CNN-based classifier. We conducted an empirical study on the Pins face dataset and evaluated the results using accuracy, precision, recall, and F1-score. A comparative analysis demonstrates that the performance of the proposed CNN classifier degrades in the presence of class imbalance. Conversely, the proposed system, using data resampling techniques, notably improves classification performance on imbalanced datasets. This study underscores the efficacy of data resampling approaches in improving the performance of Face Recognition models, offering prospects for more dependable and efficient future systems.
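To make the idea of hybrid resampling concrete, the following is a minimal sketch in Python and not the authors' implementation. The abstract does not specify the resampling procedure, so the function name `hybrid_resample`, the fixed per-class target, and the toy data are all assumptions made for illustration. The sketch randomly under-samples classes that exceed the target and randomly over-samples (by duplication) classes that fall short of it.

```python
import random
from collections import defaultdict


def hybrid_resample(samples, labels, target_per_class, seed=0):
    """Hypothetical helper: return data with exactly target_per_class items per class.

    Classes above the target are randomly under-sampled without replacement;
    classes below it are randomly over-sampled by duplicating existing items.
    """
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target_per_class:
            # Majority class: keep a random subset.
            chosen = rng.sample(items, target_per_class)
        else:
            # Minority class: keep all items and add random duplicates.
            extra = rng.choices(items, k=target_per_class - len(items))
            chosen = items + extra
        out_samples.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_samples, out_labels


# Toy data (assumed): string identifiers stand in for face images.
samples = [f"img_{i}" for i in range(14)]
labels = ["A"] * 10 + ["B"] * 3 + ["C"] * 1

balanced_samples, balanced_labels = hybrid_resample(
    samples, labels, target_per_class=5
)
```

In this toy case the three identities start with 10, 3, and 1 images and each ends with 5. In practice, resampling of this kind is usually applied to the training split only, so that duplicated images do not leak into the evaluation data.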
Keywords
Face Recognition (FR), Convolutional Neural Network (CNN), Imbalanced Class Data, Resampling Techniques.