Decision Making in Scientific Machine Learning

Authors

Jiawei Zhang¹, Xin Zhang¹ and Xinyin Miao², ¹PRA Group (Nasdaq: PRAA), USA, ²American Airlines Group Inc (Nasdaq: AAL), USA

Abstract

This paper presents partial penalty, a new approach to handling data imbalance that improves machine learning for credit card fraud detection. The approach avoids the misleading or missing data introduced by traditional over-sampling and under-sampling, keeps the training data consistent with the validation and testing data, and achieves higher performance in both validation and testing scenarios. Under the partial penalty methodology, we apply five machine learning models (Logistic Regression, Random Forest, kNN, Decision Tree, and Light Gradient Boosting) and achieve an 88.35% F1 score and a 98.79% AUC score in the testing scenario.
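
The abstract does not define the partial penalty itself. As a rough illustration only, the sketch below assumes it behaves like a cost-sensitive loss in which an extra weight is applied only to fraud (minority-class) examples while the training data are left unresampled. The names `partial_penalty_log_loss`, `f1_score`, and the `fraud_penalty` parameter are hypothetical and are not taken from the paper; the actual formulation may differ.

```python
import math


def partial_penalty_log_loss(y_true, y_prob, fraud_penalty=10.0, eps=1e-12):
    """Mean binary log loss where only fraud rows (label 1) carry an extra weight.

    Assumption: "partial penalty" is modeled here as a class-specific weight.
    This is an illustrative stand-in, not the paper's confirmed method.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        if y == 1:
            total += -fraud_penalty * math.log(p)  # penalized minority class
        else:
            total += -math.log(1.0 - p)  # majority class left unweighted
    return total / len(y_true)


def f1_score(y_true, y_pred):
    """F1 score for the positive (fraud) class from hard 0/1 predictions."""
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy imbalanced labels: 8 legitimate transactions, 2 fraudulent ones.
    y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
    y_prob = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.6, 0.1, 0.4, 0.9]
    y_pred = [1 if p >= 0.5 else 0 for p in y_prob]
    print("unweighted loss:", partial_penalty_log_loss(y_true, y_prob, fraud_penalty=1.0))
    print("penalized loss: ", partial_penalty_log_loss(y_true, y_prob, fraud_penalty=10.0))
    print("F1:", f1_score(y_true, y_pred))
```

With `fraud_penalty=1.0` the function reduces to the ordinary log loss, so the gap between the two printed losses shows how much more a missed fraud case costs under the weighted objective. No rows are duplicated, synthesized, or dropped, which is the property the abstract contrasts with over-sampling and under-sampling.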


Keywords

Partial Penalty, Gradient Boosting, Data Imbalance, Credit Card Fraud Detection, SMOTE