Applying Decision Trees to Gene Expression Data from DNA Microarrays: A Leukemia Case Study

  • Oscar Picchi Netto USP
  • Sérgio Ricardo Nozawa Centro Universitário Nilton Lins
  • Rafael Andrés Rosales Mitrowsky USP
  • Alessandra Alaniz Macedo USP
  • José Augusto Baranauskas USP


Analyzing gene expression data is challenging because the large number of features relative to the small number of available examples makes classifiers prone to overfitting. To avoid this pitfall and achieve high performance, some approaches construct complex classifiers, using new or well-established strategies. The main objective of this communication is to use decision trees to construct classifiers for microarray data that are human-readable as well as robust in performance. Using a well-known, publicly available leukemia gene expression dataset, we show the feasibility of decision trees on microarray data. In summary, we obtained simple decision trees with performance comparable to related work.


NETTO, Oscar Picchi; NOZAWA, Sérgio Ricardo; MITROWSKY, Rafael Andrés Rosales; MACEDO, Alessandra Alaniz; BARANAUSKAS, José Augusto. Applying Decision Trees to Gene Expression Data from DNA Microarrays: A Leukemia Case Study. In: SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO APLICADA À SAÚDE (SBCAS), 10., 2010, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2010. p. 1489-1498. ISSN 2763-8952.

