Uncovering Potential Proteomic Biomarkers for Cancer Patients with COVID-19 Infection using Multilabel Deep Learning Model

  • Marcelo Benedeti Palermo (Unisinos)
  • Cristiano André da Costa (Unisinos)
  • Rodrigo da Rosa Righi (Unisinos)

Abstract


The effects of COVID-19 on cancer patients are concerning. This work proposes a framework in which a multilabel classifier processes longitudinal patient proteomics data to identify potential proteomic biomarkers that link cancer and COVID-19. The framework uses Olink NPX data from 305 COVID-19-positive cancer patients. Stratified k-fold cross-validation is used to address data imbalance. Averaged across 312 labels, the results show a Jaccard index of 88.79%, a Hamming loss of 0.32%, a Wasserstein distance of 0.64%, and an area under the curve of 94.47%. Four proteins reach a Jaccard index of 97% or above and are identified as prominent biomarkers.
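
For readers unfamiliar with the multilabel metrics quoted above, the following is a minimal sketch of how a Hamming loss and a per-label Jaccard index can be computed from binary label matrices. It is not the authors' implementation: the function names, the toy data, and the convention of scoring an empty label as 1.0 are assumptions made for illustration, and the abstract does not state how the paper averages its scores.

```python
import numpy as np

def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) cells where the prediction differs from the truth."""
    return float(np.mean(y_true != y_pred))

def jaccard_per_label(y_true, y_pred):
    """Jaccard index |T ∩ P| / |T ∪ P|, computed separately for each label column."""
    inter = np.logical_and(y_true, y_pred).sum(axis=0)
    union = np.logical_or(y_true, y_pred).sum(axis=0)
    # Assumed convention: a label absent from both truth and prediction scores 1.0.
    return np.where(union == 0, 1.0, inter / np.maximum(union, 1))

# Hypothetical toy data: 4 samples, 3 labels (rows are samples, columns are labels).
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1]])

print("Hamming loss:", hamming_loss(y_true, y_pred))
print("Jaccard per label:", jaccard_per_label(y_true, y_pred))
print("Mean Jaccard:", jaccard_per_label(y_true, y_pred).mean())
```

In this toy case two of the twelve label cells are wrong, so the Hamming loss is 2/12, while the per-label Jaccard scores show which individual labels the predictions recover best. Ranking labels by such a per-label score is one plausible way to single out the highest-scoring proteins.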

Published
9 June 2025
PALERMO, Marcelo Benedeti; COSTA, Cristiano André da; RIGHI, Rodrigo da Rosa. Uncovering Potential Proteomic Biomarkers for Cancer Patients with COVID-19 Infection using Multilabel Deep Learning Model. In: SIMPÓSIO BRASILEIRO DE COMPUTAÇÃO APLICADA À SAÚDE (SBCAS), 25., 2025, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 377-388. ISSN 2763-8952. DOI: https://doi.org/10.5753/sbcas.2025.7181.
