Estimating Variable Importance and Interaction in Machine Learning via Genetic Algorithms
Abstract
This work explores the use of the Genetic Algorithm with Linkage Learning (GAwLL) for feature selection. A notable byproduct of applying GAwLL to feature selection is a variable interaction graph, which offers insight into feature dependencies. To extend the original C++ implementation, which was limited to the K-nearest neighbors (KNN) algorithm, we introduce PyGAwLLfs, a Python-based framework designed to support feature selection for a wide range of machine learning models. Using PyGAwLLfs, we assess the performance of GAwLL across several models, including decision trees, random forests, and artificial neural networks. Furthermore, based on the internal mechanisms of GAwLL, we present a novel method for estimating variable importance in machine learning. Experimental evaluation on multiple datasets, including one from particle physics, demonstrates the effectiveness of PyGAwLLfs in estimating variable importance and interaction. We also compare the variable importance scores generated by PyGAwLLfs with those produced by traditional approaches.
Published
29/09/2025
How to Cite
GUIMARÃES, David Gabriel; CARVALHO, Sabrina Sousa; NASCIMENTO, Renato Higor do; TINÓS, Renato. Estimating Variable Importance and Interaction in Machine Learning via Genetic Algorithms. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 35., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 478-492. ISSN 2643-6264.
