Volume 17, Number 5
Synergy Analysis of Ensemble Feature Selection on Performance Amelioration of Intrusion Detection System
Authors
S. Vijayalakshmi and V. Prasanna Venkatesan, Pondicherry University, India
Abstract
The unprecedented volume of online data generated by social media platforms, digital banking, networking applications and communication portals has made data preprocessing a necessary first stage if machine learning models are to discern patterns and associations in data analysis and classification tasks. To this end, effective feature extraction and selection methods have been proposed to simplify the structure of the data and the relationships within it. This underpins the need to implement feature selection in the initial stages of the machine learning pipeline, so that a suitable representation of the data is available to describe the problem more effectively and clearly. The pruned data generated by these techniques supports effective and timely analysis of organizational information to detect impending threats in the flow of network packets. Collective decisions drawn from multiple feature selection techniques surpass the results of a single feature selection method. Applying this ensemble strategy to feature selection helps ameliorate the performance of an intrusion detection system (IDS) deployed in the organizational network. An ensemble design in feature selection improves IDS performance holistically by enhancing classification efficiency, robustness and stability, and by accentuating the association between feature sets and attack signatures (attack class-oriented feature subset mapping), even when the training dataset contains disturbance or distortion. This paper thoroughly analyses how effectively IDS performance is improved by applying an ensemble architecture to feature selection techniques that adopt the DESIRE (Diversity, Equity, Scalability, Inclusivity, Reproducibility (stability) and Enhanced Performance) characteristics, as highlighted in the respective graphs, using the NSL-KDD dataset.
The diversity-generating mechanism instituted in the ensemble architecture through data perturbation, function perturbation and hybrid perturbation strategies promises comprehensive coverage of the training set by incorporating cross-validation strategies and random sampling techniques.
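The perturbation strategies named above can be illustrated with a minimal sketch. It is not the authors' implementation: the two scoring functions (`abs_correlation`, `median_split_score`), the top-k majority vote, and the synthetic four-feature data are all assumptions chosen for illustration, and NSL-KDD is not used. Data perturbation is shown as bootstrap resampling of the training rows, function perturbation as running more than one ranker on each sample, and combining the two gives a hybrid scheme.

```python
import random
from collections import Counter
from statistics import mean, median, pstdev


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 when either side is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov / (sx * sy))


def median_split_score(xs, ys):
    """Accuracy of predicting label 1 when x exceeds the median (either polarity)."""
    cut = median(xs)
    hits = sum((x > cut) == (y == 1) for x, y in zip(xs, ys))
    acc = hits / len(xs)
    return max(acc, 1 - acc)


def top_k(rows, labels, scorer, k):
    """Indices of the k highest-scoring feature columns under one scorer."""
    n_features = len(rows[0])
    scores = [scorer([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def ensemble_select(rows, labels, scorers, k, n_bootstraps=20, seed=0):
    """Hybrid perturbation: every scorer votes on every bootstrap sample."""
    rng = random.Random(seed)
    votes = Counter()
    n = len(rows)
    for _ in range(n_bootstraps):
        # Data perturbation: resample the training rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        sample = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Function perturbation: apply several different rankers.
        for scorer in scorers:
            votes.update(top_k(sample, sample_labels, scorer, k))
    # Aggregate by vote count, breaking ties by feature index.
    ranked = sorted(votes, key=lambda j: (-votes[j], j))
    return ranked[:k], votes


# Synthetic binary data (an assumption, not NSL-KDD): features 0 and 2
# depend on the label, feature 1 is pure noise, feature 3 is constant.
data_rng = random.Random(1)
labels = [i % 2 for i in range(200)]
rows = [
    [3 * y + data_rng.gauss(0, 1), data_rng.gauss(0, 1),
     -3 * y + data_rng.gauss(0, 1), 0.0]
    for y in labels
]

selected, votes = ensemble_select(
    rows, labels, [abs_correlation, median_split_score], k=2
)
print("selected features:", sorted(selected))
print("votes per feature:", dict(votes))
```

The vote counts give a rough view of stability: a feature chosen in nearly every bootstrap sample by every ranker is less sensitive to disturbance in the training set than one chosen only occasionally.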
Keywords
Ensemble Feature Selection, Intrusion, Diversity, Equity, Scalability, Inclusivity, Reproducibility (Sensitivity), Performance, Classification Efficiency.