Knowledge Discovery in Brazilian Soccer Championship Scout Data
Abstract
Sports analytics, also known as scout, has gained significant attention thanks to positive results across a wide variety of sports. Given the popularity of soccer, several studies apply Machine Learning to scout data from this sport. Such data comprise event counts, such as passes and shots, and can support the technical staff's decision-making. However, these efforts have focused on European soccer. In this work, we investigate knowledge discovery with Machine Learning algorithms on scout data from Brazilian soccer, and show the current potential and limitations of this approach.
References
Alamar, B. C. (2013). Sports analytics: A guide for coaches, managers, and other decision makers. Columbia University Press.
Arndt, C. and Brefeld, U. (2016). Predicting the future performance of soccer players. Statistical Analysis and Data Mining: The ASA Data Science Journal, 9(5):373–382.
Berrar, D., Lopes, P., Davis, J., and Dubitzky, W. (2019). Guest editorial: special issue on machine learning for soccer. Machine Learning, 108(1):1–7.
Brefeld, U. and Zimmermann, A. (2017). Guest editorial: Special issue on sports analytics. Data Mining and Knowledge Discovery, 31(6):1577–1579.
Mota, E., Coimbra, D., and Peixoto, M. (2018). Cartola FC data analysis: A simulation, analysis, and visualization tool based on Cartola FC fantasy game. In Proceedings of the XIV Brazilian Symposium on Information Systems, pages 1–8.
Pappalardo, L., Cintia, P., Rossi, A., Massucco, E., Ferragina, P., Pedreschi, D., and Giannotti, F. (2019). A public data set of spatio-temporal match events in soccer competitions. Scientific Data, 6(1):1–15.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.
Pollard, R. (1986). Home advantage in soccer: A retrospective analysis. Journal of Sports Sciences, 4(3):237–248.
Rossi, A., Pappalardo, L., Cintia, P., Iaia, F. M., Fernández, J., and Medina, D. (2018). Effective injury forecasting in soccer with GPS training data and machine learning. PLoS ONE, 13(7):e0201264.
Santos, J. M. A. d. (2019). Previsões de resultados em partidas do campeonato brasileiro de futebol. PhD thesis, Fundação Getúlio Vargas.
Schumaker, R. P., Solieman, O. K., and Chen, H. (2010). Sports data mining, volume 26. Springer Science & Business Media.
