Implementation of the K-Means Clustering Algorithm for Customer Segmentation

Kapti Kapti, Dwi Astuti

Abstract


With the rapid advancement of digital technology, companies face a growing volume and complexity of customer data, which calls for more effective customer segmentation to understand consumer behavior patterns and preferences. Because a large number of customers transact every day, managers find it difficult to distinguish highly frequent shoppers from less frequent ones on the basis of transaction data and purchasing preferences. The K-Means clustering algorithm is well suited to this problem. The objective of this study is to implement K-Means clustering for customer segmentation by grouping customers into three clusters: highly frequent shoppers, moderately frequent shoppers, and infrequent shoppers. The research methodology comprises four steps: collection of transaction and customer-preference data, data preprocessing, algorithm implementation, and validation and testing. Clustering is based on three parameters: number of purchases, number of transactions, and quantity of items purchased. Functional testing indicates that the system performs as intended, with all test scenarios executed successfully. Evaluation with the Silhouette Coefficient (SC) method yielded an average SC value of 0.97, which falls in the strong-structure category. This result indicates that the clustering is highly robust and suitable to serve as a reference model for customer segmentation.
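The pipeline summarized in the abstract (K-Means with three clusters, scored by the mean Silhouette Coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation. The customer rows are invented toy data, the feature order (purchases, transactions, items) is an assumption, and the deterministic farthest-first seeding is used only to make the example reproducible. For each point the silhouette is s = (b - a) / max(a, b), where a is the mean distance to its own cluster and b is the mean distance to the nearest other cluster.

```python
import math

# Hypothetical customers: (purchases, transactions, items purchased).
customers = [
    (2, 1, 3), (3, 2, 4), (2, 2, 3),
    (20, 10, 30), (22, 11, 33), (21, 10, 31),
    (60, 30, 90), (62, 31, 95), (61, 29, 92),
]

def farthest_first_init(points, k):
    """Deterministic seeding (an assumption, not taken from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest chosen centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign to nearest centroid, then re-average."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return labels, centroids

def silhouette_score(points, labels):
    """Mean silhouette over all points; singleton clusters score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items() if m != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

labels, centroids = kmeans(customers, k=3)

# Name the clusters by ranking centroids on mean transaction count.
order = sorted(range(3), key=lambda c: centroids[c][1])
names = dict(zip(order, ["infrequent", "moderately frequent", "highly frequent"]))
segments = [names[lab] for lab in labels]
score = silhouette_score(customers, labels)

print(segments)
print(round(score, 2))
```

On real transaction data the three features usually sit on different scales, so standardizing them before clustering is a common preprocessing choice; whether the study did so is not stated in the abstract.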

Keywords


Clustering; Grouping; Drugs; K-Means Clustering; Health Center



DOI: https://doi.org/10.32520/stmsi.v14i6.5573

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.