Optimized Neural Networks for Breast Cancer Classification Using Gene Expression Data

  • Ana Beatriz Miranda Valentin (UTFPR)
  • Glaucia Maria Bressan (UTFPR)
  • Leonardo Canuto Junior (UTFPR)
  • Elisângela Ap. da Silva Lizzi (UTFPR)

Abstract


This study develops and evaluates optimized neural networks, namely Multilayer Perceptrons (MLP) and Convolutional Neural Networks (CNN), for classifying breast cancer subtypes from gene expression data. Different network architectures and optimization strategies are implemented and compared to determine the accuracy and efficiency of each classification method. The data come from The Cancer Genome Atlas (TCGA) repository and are preprocessed, including dimensionality reduction, before analysis. The contribution is twofold: to enhance diagnostic tools and to assess the predictive performance of the approaches. Comparing the networks' performance offers a promising path toward more precise medical diagnostics and more personalized treatment strategies in breast cancer.
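
The abstract names the pipeline stages (dimensionality reduction, a neural network classifier, and an optimization step) without giving their details. The sketch below is only a rough illustration of how such stages might fit together, not the authors' implementation. It assumes PCA as the reduction method, a one-hidden-layer MLP, and a tiny search over the hidden-layer width as a stand-in for hyperparameter optimization. The data are synthetic, and all function names, sizes, and settings are hypothetical.

```python
import numpy as np

def make_synthetic_expression(n_per_class=60, n_genes=500, n_classes=4, seed=0):
    """Synthetic samples-by-genes matrix; a stand-in only, NOT TCGA data."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 1.0, size=(n_classes, n_genes))
    X = np.vstack([c + rng.normal(0.0, 1.0, size=(n_per_class, n_genes)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    order = rng.permutation(len(y))  # shuffle so every split mixes the classes
    return X[order], y[order]

def fit_pca(X, n_components):
    """Assumed reduction step: PCA via SVD, fitted on training data only."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    scale = ((X - mean) @ components.T).std(axis=0)  # per-component training std
    return mean, components, scale

def apply_pca(X, pca):
    mean, components, scale = pca
    return ((X - mean) @ components.T) / scale

def train_mlp(X, y, n_classes, hidden, lr=0.1, epochs=500, seed=0):
    """One-hidden-layer MLP (ReLU + softmax), full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, np.sqrt(2.0 / X.shape[1]), size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, np.sqrt(1.0 / hidden), size=(hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        H = np.maximum(0.0, X @ W1 + b1)
        logits = H @ W2 + b2
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P = P / P.sum(axis=1, keepdims=True)
        d_logits = (P - Y) / len(X)            # gradient of mean cross-entropy
        d_H = (d_logits @ W2.T) * (H > 0)      # backpropagate through ReLU
        W2 -= lr * (H.T @ d_logits)
        b2 -= lr * d_logits.sum(axis=0)
        W1 -= lr * (X.T @ d_H)
        b1 -= lr * d_H.sum(axis=0)
    return W1, b1, W2, b2

def accuracy(params, X, y):
    W1, b1, W2, b2 = params
    pred = (np.maximum(0.0, X @ W1 + b1) @ W2 + b2).argmax(axis=1)
    return float((pred == y).mean())

X, y = make_synthetic_expression()
X_train, y_train = X[:144], y[:144]
X_val, y_val = X[144:192], y[144:192]
X_test, y_test = X[192:], y[192:]

pca = fit_pca(X_train, n_components=10)
Z_train, Z_val, Z_test = (apply_pca(A, pca) for A in (X_train, X_val, X_test))

# Stand-in for hyperparameter optimization: choose the width by validation accuracy.
best_hidden, best_params, best_val = None, None, -1.0
for hidden in (8, 32):
    params = train_mlp(Z_train, y_train, n_classes=4, hidden=hidden)
    val_acc = accuracy(params, Z_val, y_val)
    if val_acc > best_val:
        best_hidden, best_params, best_val = hidden, params, val_acc

test_acc = accuracy(best_params, Z_test, y_test)
print(f"selected hidden width: {best_hidden}, test accuracy: {test_acc:.2f}")
```

The reduction is fitted on the training split only and then applied to the validation and test splits, so the held-out accuracy is not influenced by the test samples. A dedicated optimizer could replace the two-value loop without changing the rest of the sketch.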

Published
2024-12-02
VALENTIN, Ana Beatriz Miranda; BRESSAN, Glaucia Maria; CANUTO JUNIOR, Leonardo; LIZZI, Elisângela Ap. da Silva. Optimized Neural Networks for Breast Cancer Classification Using Gene Expression Data. In: BRAZILIAN SYMPOSIUM ON BIOINFORMATICS (BSB), 17., 2024, Vitória/ES. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 36-46. ISSN 2316-1248. DOI: https://doi.org/10.5753/bsb.2024.245194.