Relevance of Problem Domain Understanding in the Construction of Computational Learning Models

Abstract


The objective of this work is to confirm the relevance of prior understanding of the problem domain for data science projects, specifically for building learning models. As case studies, we consider three problem domains in the health area, with the National Health Survey (PNS 2019), prepared by IBGE, as the main data source. The experiments show that prior understanding of the problem domain and its representation through conceptual models are useful for applying a conceptual attribute selection process in the search for more accurate learning models.

Keywords: Problem domain understanding, Domain knowledge in data science, Conceptual models, Attribute selection process, Learning models, Data science projects, Health data analysis, National Health Survey (PNS 2019), Machine learning in health, Conceptual attribute selection, Model precision, Relevance of domain expertise, Health data mining, Knowledge representation, Data-driven decision making

Published
2024-10-14
GONÇALVES, Ligia F. de Carvalho; FRANCA, Daniel Rocha; ZARATE, Luis Enrique. Relevance of Problem Domain Understanding in the Construction of Computational Learning Models. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 18., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 135-142. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2024.240233.