Analysis of Socioeconomic and Anthropometric Data via Tree-based Models: Evidence for Policies Against Hunger

  • João Gabriel Soares Ferreira UFC
  • César Lincoln Cavalcante Mattos UFC
  • Antonio Rafael Braga UFC
  • Danielo G. Gomes UFC

Abstract


Hunger and food insecurity remain regional and global problems, aggravated by the lack of broad access to data on these issues. Socioeconomic data at the individual (or household) level, however, are collected and available throughout Brazil. This work therefore proposes machine learning models that predict hunger-related anthropometric indicators from socioeconomic data. The socioeconomic data were automatically extracted from the CECAD platform (Query, Selection and Extraction of Information from CadÚnico). The anthropometric indicators (low weight-for-height, low weight-for-age and low height-for-age) were collected from SISVAN (Food and Nutrition Surveillance System). The experiments focused on decision tree-based models (Random Forest, Gradient Boosting, XGBoost, LightGBM and CatBoost). The models were trained on all Brazilian municipalities except those in the state of Ceará, which were held out for testing. The best models obtained promising results in the prediction task, especially for the low height-for-age indicator, where the percentage error reached 22%.
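The evaluation protocol described in the abstract (train on every state except Ceará, test on Ceará, report a percentage error) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the records are made-up values rather than CECAD or SISVAN data, the helper names `split_by_state` and `mape` are hypothetical, the training-set mean stands in for a tree-based regressor, and the "percentage error" is assumed here to be the mean absolute percentage error, which the abstract does not specify.

```python
from statistics import mean

# Hypothetical municipality records:
# (state code, one socioeconomic feature, indicator prevalence in %).
# All values are invented for illustration only.
records = [
    ("SP", 0.30, 8.0),
    ("BA", 0.55, 12.0),
    ("PE", 0.60, 10.0),
    ("CE", 0.58, 10.0),
    ("CE", 0.62, 12.5),
]


def split_by_state(rows, held_out_state):
    """Hold out every municipality of one state for testing."""
    train = [r for r in rows if r[0] != held_out_state]
    test = [r for r in rows if r[0] == held_out_state]
    return train, test


def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * mean(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


# Geographic hold-out: Ceará municipalities never appear in training.
train, test = split_by_state(records, "CE")

# Placeholder predictor (assumption): the mean of the training targets.
# A tree-based regressor fitted on the training features would go here.
baseline = mean(r[2] for r in train)

y_true = [r[2] for r in test]
y_pred = [baseline] * len(test)
error = mape(y_true, y_pred)
print(f"Held-out percentage error: {error:.1f}%")
```

Splitting by state rather than at random keeps neighbouring municipalities of the test region out of the training set, so the reported error reflects how well a model transfers to a region it has not seen.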

Published
2025-09-29
FERREIRA, João Gabriel Soares; MATTOS, César Lincoln Cavalcante; BRAGA, Antonio Rafael; GOMES, Danielo G. Analysis of Socioeconomic and Anthropometric Data via Tree-based Models: Evidence for Policies Against Hunger. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1-12. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.10488.