Comparison of Malware Classification Methods using Convolutional Neural Network based on API Call Stream

Matthew Schofield; Gulsum Alicioglu; Bo Sun; Russell Binaco; Paul Turner; Cameron Thatcher; Alex Lam; Anthony Breitzman

doi:10.5121/ijnsa.2021.13201

Volume 13, Number 2

Authors

Matthew Schofield, Gulsum Alicioglu, Bo Sun, Russell Binaco, Paul Turner, Cameron Thatcher, Alex Lam and Anthony Breitzman, Rowan University, USA

Abstract

Malicious software is constantly being developed and improved, so detection and classification of malware is an ever-evolving problem. Since traditional malware detection techniques fail to detect new or unknown malware, machine learning algorithms have been used to overcome this disadvantage. We present a Convolutional Neural Network (CNN) for malware type classification based on API (Application Programming Interface) calls. This research uses a database of 7107 instances of API call streams and 8 different malware types: Adware, Backdoor, Downloader, Dropper, Spyware, Trojan, Virus, and Worm. We apply a 1-Dimensional CNN to API calls mapped to categorical vectors and to term frequency-inverse document frequency (TF-IDF) vectors, and compare the results to other classification techniques. The proposed 1-D CNN outperformed the other classification techniques, with 91% overall accuracy for both categorical and TF-IDF vectors.
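As a rough illustration of the two input representations named in the abstract, the sketch below maps toy API call streams to fixed-length categorical (integer ID) sequences and to TF-IDF vectors. It is a minimal example under stated assumptions, not the authors' pipeline: the API call names, the sequence length of 5, the helper names (build_vocab, to_categorical, tfidf_vectors), and the unsmoothed weighting tf = count / stream length, idf = log(N / df) are all chosen here for illustration.

```python
import math
from collections import Counter

# Hypothetical toy API call streams; names and lengths are illustrative only.
streams = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegSetValueExW", "CloseHandle"],
    ["CreateFileW", "ReadFile", "CloseHandle", "CreateFileW"],
]


def build_vocab(streams):
    """Assign each distinct API call an integer ID, reserving 0 for padding."""
    vocab = {}
    for stream in streams:
        for call in stream:
            vocab.setdefault(call, len(vocab) + 1)
    return vocab


def to_categorical(stream, vocab, length):
    """Map a stream to a fixed-length ID sequence (truncate or zero-pad)."""
    ids = [vocab[call] for call in stream[:length]]
    return ids + [0] * (length - len(ids))


def tfidf_vectors(streams, vocab):
    """Build one TF-IDF vector per stream.

    Assumed weighting: tf = count / stream length, idf = log(N / df).
    """
    n = len(streams)
    # Document frequency: number of streams containing each call at least once.
    df = Counter(call for stream in streams for call in set(stream))
    vectors = []
    for stream in streams:
        vec = [0.0] * len(vocab)
        for call, count in Counter(stream).items():
            vec[vocab[call] - 1] = (count / len(stream)) * math.log(n / df[call])
        vectors.append(vec)
    return vectors


vocab = build_vocab(streams)
categorical = [to_categorical(s, vocab, length=5) for s in streams]
tfidf = tfidf_vectors(streams, vocab)
```

The categorical form preserves call order, which is what a 1-D convolution slides over, while the TF-IDF form discards order and down-weights calls that appear in every stream (CloseHandle gets weight 0 in this toy data).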

Keywords

Convolutional Neural Network, Malware Classification, N-gram Analysis, Term Frequency-Inverse Document Frequency Vectors, Windows API Calls.