Comparison of Machine Learning Algorithms for Credit Score-based Banking Customer Churn Prediction

Suryadillah Hendrawinata, Jasmir Jasmir, Gunardi Gunardi

Abstract


A high customer churn rate represents a significant challenge for the banking industry, leading to substantial financial losses and higher acquisition costs for new customers. Proactively identifying customers who are likely to churn is essential for implementing effective retention strategies. This study aims to address this issue by implementing and comprehensively comparing three different machine learning classification algorithms: Logistic Regression, Random Forest, and XGBoost.
The study used a secondary dataset of 10,000 bank customer profiles with attributes including credit score, account balance, and transaction activity. The methodology followed the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework. The models were evaluated on Accuracy, Precision, Recall, F1-Score, and ROC-AUC. The ensemble models substantially outperformed the linear model, Logistic Regression, which achieved an F1-Score of only 0.286. Random Forest was the best-performing model in this study, with the highest Accuracy (0.864), F1-Score (0.590), and ROC-AUC (0.852). XGBoost was competitive, with an F1-Score of 0.579 and a ROC-AUC of 0.832. The study concludes that Random Forest offers the best overall performance and the strongest capability for identifying at-risk customers in this dataset.
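As an illustration of how the reported metrics relate to one another, the following minimal sketch computes Accuracy, Precision, Recall, F1-Score, and ROC-AUC from scratch. The labels, scores, and the 0.5 decision threshold are invented for illustration and are not taken from the study's dataset or models; the function names are hypothetical.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 means churned."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Compute the threshold-dependent metrics from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the chance that a random churner is scored above a
    random non-churner, counting ties as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 10 customers, 2 of whom churned (label 1).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.2, 0.3, 0.1, 0.05]

# Assumed decision threshold of 0.5 on the predicted churn score.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

In this toy case the accuracy is 0.8 while the F1-Score is only 0.5, because the many correctly classified non-churners raise accuracy without helping precision or recall on the churn class. A similar gap between Accuracy and F1-Score appears in the results reported above.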

Keywords


classification; customer churn; CRISP-DM; random forest; XGBoost



DOI: https://doi.org/10.32520/stmsi.v15i5.6148



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.