A Comparative Study on Customer Churn Analysis Using Machine Learning and Data Enrichment Techniques
DOI: https://doi.org/10.31181/jscda21202441
Keywords: Churn Analysis, Data Mining, Customer Relationship Management, Machine Learning Algorithms, RFM Analysis
Abstract
As online shopping grows, companies can collect more customer data, which they use to get to know their customers better and to provide customized services. Churn analysis is one of the most essential analyses derived from this data; it provides information about when a customer will stop shopping with the company. In this study, we perform a churn analysis using machine learning (ML) algorithms on the customer behavior data of a fashion retail company, following a four-stage methodology. First, we carried out data preparation and visualization studies and created baseline models using various ML algorithms. Second, we enriched the data with the RFM (Recency, Frequency, Monetary) score and repeated the analysis. Third, we used the Synthetic Minority Oversampling Technique (SMOTE) to address the class imbalance in the data and performed parameter optimization on the algorithms using the SMOTE data. In the last stage, we divided the whole data set into clusters using the k-means technique and applied the ML algorithms to the clustered data. We compared the accuracy and F1 score values obtained at each stage and examined the effect of the algorithms and of segmentation on the results. The analysis shows that the extreme gradient boosting algorithm provides better accuracy and F1 score values than the other algorithms. Using these results, the company can identify customers likely to churn and begin funding Customer Relationship Management (CRM) efforts. Additionally, experts can determine the company's development directions by organizing campaigns for these customers and analysing their reasons for churn in more detail.
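The RFM enrichment step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the RFM score was computed, so everything below is an assumption made for illustration: the `(customer_id, purchase_date, amount)` record layout, the function names `rfm_table`, `rank_score` and `rfm_scores`, and the rank-based binning into equal-sized groups.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Aggregate (customer_id, purchase_date, amount) rows into raw RFM values.

    Returns {customer_id: (recency_in_days, purchase_count, total_spend)}.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        previous = last_purchase.get(customer_id)
        if previous is None or purchase_date > previous:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        c: ((as_of - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


def rank_score(values, bins=5, higher_is_better=True):
    """Map each customer's value to a score in 1..bins by rank position.

    Assumption: equal-sized rank groups; ties are broken by input order.
    """
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {c: 1 + (i * bins) // n for i, c in enumerate(ordered)}


def rfm_scores(transactions, as_of, bins=5):
    """Return {customer_id: (R, F, M)} where a higher score is better."""
    raw = rfm_table(transactions, as_of)
    # Fewer days since the last purchase is better, so recency is reversed.
    r = rank_score({c: v[0] for c, v in raw.items()}, bins, higher_is_better=False)
    f = rank_score({c: v[1] for c, v in raw.items()}, bins)
    m = rank_score({c: v[2] for c, v in raw.items()}, bins)
    return {c: (r[c], f[c], m[c]) for c in raw}


if __name__ == "__main__":
    # Hypothetical transactions, not taken from the study's data.
    demo = [
        ("A", date(2024, 1, 5), 40.0),
        ("A", date(2024, 3, 1), 60.0),
        ("B", date(2023, 11, 20), 25.0),
        ("C", date(2024, 2, 10), 300.0),
        ("C", date(2024, 2, 25), 120.0),
        ("C", date(2024, 3, 10), 80.0),
    ]
    print(rfm_scores(demo, as_of=date(2024, 3, 31), bins=3))
```

The three scores could then be appended to each customer's feature row before model training, which is one plausible reading of "data enrichment" here; other schemes, such as fixed thresholds or a single combined score, would fit the abstract equally well.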
License
Copyright (c) 2024 Journal of Soft Computing and Decision Analytics
This work is licensed under a Creative Commons Attribution 4.0 International License.