Assessor Models with Reject Option for Soccer Result Prediction
Abstract
Soccer is a widely popular sport, both in Brazil and around the world, with a billion-dollar industry surrounding it. Data and Machine Learning (ML) algorithms have been explored as tools for predicting outcomes in this sport. However, the unpredictability of soccer makes it difficult to obtain accurate and reliable predictions. In this study, we propose using an ML model called an assessor, which analyzes the predictions returned by a match-outcome classifier, selects those with the highest reliability, and discards the rest. We seek to optimize the trade-off between the accuracy of the accepted predictions and the rejection rate, in order to maximize the reliability of the model used to predict match outcomes. We performed experiments with real data, identifying the championships, teams, and rounds in which the proposed model performs best. This approach contributes to improving soccer result prediction by combining ML techniques with the selection of high-quality predictions.
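The trade-off between accepted-prediction accuracy and rejection rate can be illustrated with a minimal sketch. It assumes, hypothetically, that the assessor outputs a reliability score in [0, 1] for each classifier prediction and that a prediction is accepted when its score reaches a chosen threshold. The function name, the threshold rule, and the toy data are illustrative assumptions, not the method or results of this study.

```python
from typing import List, Tuple


def accept_reject_tradeoff(
    reliability: List[float],
    correct: List[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (accuracy on accepted predictions, rejection rate).

    reliability: hypothetical assessor scores, one per classifier prediction.
    correct: whether each classifier prediction matched the true outcome.
    threshold: predictions scoring below this value are rejected (assumed rule).
    """
    # Keep the correctness flags of the predictions the assessor accepts.
    accepted = [c for r, c in zip(reliability, correct) if r >= threshold]
    rejection_rate = 1.0 - len(accepted) / len(correct)
    # Accuracy is undefined when every prediction is rejected.
    accuracy = sum(accepted) / len(accepted) if accepted else float("nan")
    return accuracy, rejection_rate


# Toy, made-up values for illustration only (not data from the study).
reliability = [0.91, 0.35, 0.78, 0.52, 0.66, 0.20, 0.85, 0.44]
correct = [True, False, True, False, True, False, True, True]

for threshold in (0.0, 0.5, 0.8):
    acc, rej = accept_reject_tradeoff(reliability, correct, threshold)
    print(f"threshold={threshold:.1f} accepted_accuracy={acc:.3f} rejection_rate={rej:.3f}")
```

On this toy data, raising the threshold increases the accuracy of the accepted predictions while also increasing the share of rejected matches, which is the relationship the study seeks to optimize.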
