Exploring explainable machine learning to predict genetic factors associated with survival in renal cell carcinoma using transcriptomic data
Abstract
The integration of machine learning (ML) with transcriptomic data offers a promising path for advancing precision oncology by improving predictive performance and identifying clinically relevant molecular signatures. This study analyzes gene signatures associated with survival in two renal cell carcinoma subtypes (KIRC and KIRP) using TCGA data, combining predictive modeling with three explainable AI (XAI) techniques: SHAP, permutation importance, and surrogate decision trees. Among five evaluated algorithms, CatBoost achieved the best performance (AUC-ROC: 0.82 for KIRC). The XAI analysis identified both established and novel biomarker candidates related to therapeutic resistance and inflammatory pathways. These results highlight the potential of explainable AI to bridge predictive accuracy and biological interpretability.
References
Albertson, D., Barry, M., Liu, T., Mahlow, J., and Sirohi, D. (2025). Genotype phenotype correlation of renal tumors in The Cancer Genome Atlas database. International Journal of Surgical Pathology, 33(2):289–301.
Feng, H. and Shen, W. (2020). ACAA1 is a predictive factor of survival and is correlated with T cell infiltration in non-small cell lung cancer. Frontiers in Oncology, 10:564796.
Kang, L., Wang, D., Shen, T., Liu, X., Dai, B., Zhou, D., Shen, H., Gong, J., Li, G., Hu, Y., et al. (2023). PDIA4 confers resistance to ferroptosis via induction of ATF4/SLC7A11 in renal cell carcinoma. Cell Death & Disease, 14(3):193.
Kolenda, T., Poter, P., Guglas, K., Kozłowska-Masłoń, J., Braska, A., Kazimierczak, U., and Teresiak, A. (2023). Biological role and diagnostic utility of ribosomal protein L23a pseudogene 53 in cutaneous melanoma. Reports of Practical Oncology and Radiotherapy, 28(2):255–270.
Peduzzi, G., Felici, A., Pellungrini, R., and Campa, D. (2025). Explainable machine learning identifies a polygenic risk score as a key predictor of pancreatic cancer risk in the UK Biobank. Digestive and Liver Disease, 57(4):915–922.
Seo, H., Park, J.-H., Lee, J., and Chung, B. C. (2025). Explainable AI based feature selection in cancer RNA-seq. ICT Express.
Shah, H. H. and Lodhi, S. K. (2025). AI in personalized medicine: Tailoring treatment plans based on individual patient data. Global Trends in Science and Technology, 1(1):15–29.
Sharma, V., Davies, A., and Ainsworth, J. (2021). Clinical risk prediction models: the canary in the coalmine for artificial intelligence in healthcare? BMJ Health & Care Informatics, 28(1):e100421.
Su, L., Hounye, A. H., Pan, Q., Miao, K., Wang, J., Hou, M., and Xiong, L. (2024). Explainable cancer factors discovery: Shapley additive explanation for machine learning models demonstrates the best practices in the case of pancreatic cancer. Pancreatology, 24(3):404–423.
Tabibu, S., Vinod, P., and Jawahar, C. (2019). Pan-renal cell carcinoma classification and survival prediction from histopathology images using deep learning. Scientific Reports, 9(1):10509.
Tran, D., Nguyen, H., Pham, V.-D., Nguyen, P., Nguyen Luu, H., Minh Phan, L., Blair DeStefano, C., Jim Yeung, S.-C., and Nguyen, T. (2025). A comprehensive review of cancer survival prediction using multi-omics integration and clinical variables. Briefings in Bioinformatics, 26(2):bbaf150.
Wang, Q., Liu, J., Li, R., Wang, S., Xu, Y., Wang, Y., Zhang, H., Zhou, Y., Zhang, X., Chen, X., et al. (2024). Assessing the role of programmed cell death signatures and related gene TOP2A in progression and prognostic prediction of clear cell renal cell carcinoma. Cancer Cell International, 24(1):164.
Zhang, H., Liu, Y., Wang, B., and Wang, C. (2022). Interleukin 20 receptor subunit beta (IL20RB) predicts poor prognosis and regulates immune cell infiltration in clear cell renal cell carcinoma. BMC Genomic Data, 23(1):58.
Zhang, L., Wu, X., Fan, X., and Ai, H. (2023). MUM1L1 as a tumor suppressor and potential biomarker in ovarian cancer: evidence from bioinformatics analysis and basic experiments. Combinatorial Chemistry & High Throughput Screening, 26(14):2487–2501.
Zhou, K., Li, Y., Wang, W., Chen, Y., Qian, B., Liang, Y., Li, H., Xu, R., and Zhuang, L. (2025). SLFN11: a pan-cancer biomarker for DNA-targeted drugs sensitivity and therapeutic strategy guidance. Frontiers in Oncology, 15:1582738.
Zhou, Y., Zheng, X., Xu, B., Hu, W., Huang, T., and Jiang, J. (2019). The identification and analysis of mRNA–lncRNA–miRNA cliques from the integrative network of ovarian cancer. Frontiers in Genetics, 10:751.
Published
01/06/2026
How to Cite
FEIJÓ, Grace dos Santos; RECAMONDE-MENDOZA, Mariana. Exploring explainable machine learning to predict genetic factors associated with survival in renal cell carcinoma using transcriptomic data. In: SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO APLICADA À SAÚDE (SBCAS), 26., 2026, Ouro Preto/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 1499-1504. ISSN 2763-8952. DOI: https://doi.org/10.5753/sbcas.2026.21697.
