Application of Machine Learning in the Diagnosis of Peste des Petits Ruminants: A Comparative Approach Between Classification and Clustering

  • Rafael L. Araújo IFPI / UFPI
  • Vitor R. F. Da Silva IFPI
  • Francisco E. Santos IFPI
  • Anthony I. M. Luz UFPI
  • Romuere R. V. e Silva UFPI

Abstract


Peste des Petits Ruminants (PPR) is an infectious disease affecting goats and sheep, causing significant economic and health impacts. Traditional diagnostic methods, such as RT-qPCR, are accurate but require time and specialized infrastructure. This study evaluates the use of Machine Learning techniques to support the diagnosis of PPR based on clinical data. Classification and clustering models were applied to identify patterns associated with the presence of the disease. Gradient Boosting achieved the best predictive performance, while clustering analysis revealed relevant structures in the dataset. The findings suggest the potential of these approaches to support the detection and monitoring of PPR.

Published
2025-05-28
ARAÚJO, Rafael L.; SILVA, Vitor R. F. Da; SANTOS, Francisco E.; LUZ, Anthony I. M.; V. E SILVA, Romuere R. Application of Machine Learning in the Diagnosis of Peste des Petits Ruminants: A Comparative Approach Between Classification and Clustering. In: UNIFIED COMPUTING MEETING OF PIAUÍ (ENUCOMPI), 17., 2025, Teresina/PI. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 149-158. DOI: https://doi.org/10.5753/enucompi.2025.9782.