Analysis of Machine Learning Algorithms with Feature Selection for Intrusion Detection using UNSW-NB15 Dataset

Geeta Kocher; Gulshan Kumar

doi:10.5121/ijnsa.2021.13102

Volume 13, Number 1

Authors

Geeta Kocher¹ and Gulshan Kumar², ¹MRSPTU, India, ²SBSSTC, India

Abstract

Machine learning classifiers have recently been widely used to improve network intrusion detection, and many solutions have been proposed in the literature. However, these classifiers are often trained on older intrusion detection datasets, which limits their detection accuracy, so there is a need to train them on a more recent dataset. In this paper, the more recent UNSW-NB15 dataset is used to train machine learning classifiers. Five classifiers, K-Nearest Neighbors (KNN), Stochastic Gradient Descent (SGD), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB), are selected from a taxonomy of classifiers based on lazy and eager learners. Chi-Square, a filter-based feature selection technique, is applied to the UNSW-NB15 dataset to remove irrelevant and redundant features. Classifier performance is measured in terms of Accuracy, Mean Squared Error (MSE), Precision, Recall, F1-Score, True Positive Rate (TPR), and False Positive Rate (FPR), both with and without feature selection, and a comparative analysis of the classifiers is carried out.
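As an illustration only, the two building blocks named in the abstract can be sketched in plain Python: a Chi-Square score per feature and the listed evaluation metrics for a binary (attack vs. normal) labeling. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `chi2_scores` and `binary_metrics`, the toy data, and the binary 0/1 labeling are all hypothetical, and the abstract does not state how many features were retained or how MSE was computed.

```python
from collections import Counter


def chi2_scores(X, y):
    """Chi-square score of each non-negative feature against the class label.

    For every feature, the observed per-class sum of feature values is
    compared with the sum expected if the feature were independent of the
    class. Higher scores suggest a more class-dependent feature.
    """
    n = len(y)
    class_counts = Counter(y)
    scores = []
    for j in range(len(X[0])):
        total = sum(row[j] for row in X)
        score = 0.0
        for label, count in class_counts.items():
            observed = sum(row[j] for row, lab in zip(X, y) if lab == label)
            expected = total * count / n
            if expected > 0:  # skip empty cells to avoid division by zero
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def binary_metrics(y_true, y_pred):
    """Metrics from the abstract, assuming labels 1 = attack and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "mse": ratio(sum((t - p) ** 2 for t, p in pairs), len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "tpr": recall,  # TPR is the same quantity as recall
        "fpr": ratio(fp, fp + tn),
    }


if __name__ == "__main__":
    # Hypothetical toy data: feature 0 varies with the class, feature 1 is constant.
    X = [[3, 1], [4, 1], [0, 1], [1, 1]]
    y = [1, 1, 0, 0]
    print(chi2_scores(X, y))
    print(binary_metrics(y, [1, 0, 0, 0]))
```

Under the binary assumption used here, TPR and Recall are the same number, and MSE on 0/1 labels equals the misclassification rate (1 minus Accuracy). A filter-based selection step would rank features by score and keep the top-ranked ones before training each classifier.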

Keywords

Intrusion Detection System, MSE, SGD, UNSW-NB15, Machine Learning Algorithms.