Beyond Connection: Combining Multiple Data Sources to Understand and Predict Residential Internet Dropout
Abstract
User retention is a growing concern for residential Internet service providers because of intense competition. This paper combines multiple data sources from a multinational telecommunications company to build machine learning models that predict customer churn. An initial analysis of the data is performed, and several classification models are compared, with promising results. The features that most influence a customer’s decision to leave are also identified, so the proposed solution can support strategies for mitigating customer churn.
