Volume 11, Number 4

Log Message Anomaly Detection with Oversampling


Amir Farzad and T. Aaron Gulliver, University of Victoria, Canada


Imbalanced data is a significant challenge in classification with machine learning algorithms. This is particularly important for log message data because negative logs are sparse, so the data is typically imbalanced. In this paper, a model that employs a SeqGAN network to generate text log messages is proposed. An autoencoder is used for feature extraction, and anomaly detection is performed with a GRU network. The proposed model is evaluated on three imbalanced log data sets, namely BGL, OpenStack, and Thunderbird. The results show that appropriate oversampling and data balancing improve anomaly detection accuracy.


Deep Learning, Oversampling, Log messages, Anomaly detection, Classification.