Authors
Amarachi Blessing Mbakwe and Sikiru Ademola Adewale, Virginia Tech, USA
Abstract
Fraud is a critical issue in society today. Losses due to payment fraud are increasing as e-commerce keeps evolving. Organizations, governments, and individuals have experienced huge losses due to payment fraud. Merchant Savvy projects that global losses due to payment fraud will increase to about $40.62 billion in 2027. Among all types of payment fraud, credit card fraud results in the highest losses. Therefore, we leverage machine learning to address credit card fraud, an approach that can be generalized to other fraud types. This paper compares the performance of logistic regression, decision trees, random forest, isolation forest, local outlier factor, and one-class support vector machines (SVM) based on their AUC and F1-score. We applied the Synthetic Minority Oversampling Technique (SMOTE) to handle the imbalanced nature of the data and compared the performance of the supervised models on the oversampled data with their performance on the raw data. In our results, the random forest classifier outperformed the other models, with a higher AUC and a better F1-score on both the raw and oversampled data. Oversampling did not change the results for the decision trees. One-class SVM performed better than isolation forest in terms of AUC but had a much lower F1-score. The local outlier factor had the poorest performance.
Keywords
Credit card, fraud, detection, Isolation Forest, One-class SVM, Supervised algorithms.