Using Machine Learning to identify profiles of individuals with depression

  • Carlos D. Maia Pontifícia Universidade Católica de Minas Gerais
  • Cristiane N. Nobre Pontifícia Universidade Católica de Minas Gerais
  • Marco Paulo S. Gomes Pontifícia Universidade Católica de Minas Gerais
  • Luis E. Zárate Pontifícia Universidade Católica de Minas Gerais


Depression is a major public health problem in Brazil, affecting millions of individuals each year. Although its prevalence in Brazil is well documented, more accurate and timely predictions of depression trends are still needed to improve treatment and prevention strategies. In this study, we explored the potential of machine learning algorithms to forecast depression trends in Brazil using data from the National Health Survey conducted by the Brazilian Institute of Geography and Statistics. We compared the performance of several machine learning models in predicting depression trends, including decision trees, random forests, support vector machines, and neural networks. We also aimed to identify key risk factors for depression in Brazil, including age, gender, income, education, and marital status. These findings have important implications for public health policies and mental healthcare in Brazil. Our study provides insights into the use of machine learning algorithms to predict depression trends and support prevention, and highlights the potential of data-driven approaches to improve mental health outcomes in Brazil.

Keywords: Machine Learning, Health, Depression


MAIA, Carlos D.; NOBRE, Cristiane N.; GOMES, Marco Paulo S.; ZÁRATE, Luis E. Using Machine Learning to identify profiles of individuals with depression. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 11., 2023, Belo Horizonte/MG. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 105-112. ISSN 2763-8944. DOI: