Combining Data Mining Techniques to Analyse Factors Associated with Allocation of Socioeconomic Resources at IFMG


Granting socioeconomic assistance to students at Federal Education Institutions is one way of providing financial support during their studies, with priority given to those who are most socially vulnerable. Institutions run selection processes to identify students who fit the eligibility profile and to distribute the grants appropriately within the budget available for this purpose. This article applied Data Mining techniques to data on students who applied for scholarships at IFMG - Campus Bambuí, seeking to identify the attributes associated with the distribution of benefits and to assess the adequacy of the indicator the institution currently uses to classify students' level of social vulnerability. The proposed methodology combined different machine learning techniques, including classification and feature selection. In addition to identifying the degree of importance of each attribute in the constructed model, the distinguishing contribution of this article is a set of well-founded suggestions for new attributes that could improve the index used by the institution and, consequently, reduce the workload of those who analyze the selection processes. Adding five new attributes to the institution's index yielded a gain of around 10% in classification performance.
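As a rough illustration of the feature selection step, the sketch below ranks categorical attributes by information gain with respect to the grant decision. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say which selection criterion or classifier was used, information gain is assumed here only as one common choice, and the records and attribute names (`income_band`, `lives_with_family`, `shift`, `granted`) are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on one attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def rank_attributes(rows, target):
    """Return (attribute, information gain) pairs, most informative first."""
    attributes = [name for name in rows[0] if name != target]
    scores = {a: information_gain(rows, a, target) for a in attributes}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records; these are NOT data from the study.
applicants = [
    {"income_band": "low", "lives_with_family": "no", "shift": "day", "granted": "yes"},
    {"income_band": "low", "lives_with_family": "yes", "shift": "night", "granted": "yes"},
    {"income_band": "low", "lives_with_family": "no", "shift": "night", "granted": "yes"},
    {"income_band": "low", "lives_with_family": "yes", "shift": "day", "granted": "yes"},
    {"income_band": "high", "lives_with_family": "yes", "shift": "day", "granted": "no"},
    {"income_band": "high", "lives_with_family": "yes", "shift": "night", "granted": "no"},
    {"income_band": "high", "lives_with_family": "no", "shift": "day", "granted": "no"},
    {"income_band": "high", "lives_with_family": "yes", "shift": "night", "granted": "no"},
]

ranking = rank_attributes(applicants, target="granted")
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In a setting like the one described, the top-ranked attributes would be candidates for extending the vulnerability index, and a classifier trained with and without them would show whether they actually improve performance.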

Keywords: Data Mining, Automatic Classification, Feature Selection


MELO, Eduardo; TULER, Elisa; ROCHA, Leonardo. Combining Data Mining Techniques to Analyse Factors Associated with Allocation of Socioeconomic Resources at IFMG. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 9., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 89-96. ISSN 2763-8944. DOI: