A Novel Interpretable Approach to Deep Multimodal Data Fusion Applied to Cancer Diagnosis

  • Leandro M. de Lima UFES
  • Matheus B. Rocha UFES
  • Renato A. Krohling UFES

Abstract


The use of artificial intelligence (AI) in healthcare has grown significantly, particularly in cancer diagnosis, which remains one of the leading causes of mortality worldwide. AI-based computer-aided diagnosis systems leveraging computer vision have explored multimodal data, including images, text, and graphs, to improve diagnostic accuracy. However, many approaches, especially those involving middle fusion, are considered black-box models. Inspired by the clinical reasoning of medical professionals, and aiming at a more interpretable approach based on deep multimodal fusion, this paper proposes a new hybrid approach. First, the medical lesion image is classified using a convolutional neural network or a transformer, yielding a probability for one of the target classes. Next, this probability, combined with clinical and/or sociodemographic data, is fed to an external classifier, which produces the final diagnosis. The proposed approach was evaluated on: 1) the skin cancer PAD-UFES-20 dataset, consisting of clinical images and patient lesion information, and 2) the oral cancer NDB-UFES dataset, consisting of histopathological images and sociodemographic data. On the one hand, the results indicate slightly inferior performance in terms of balanced accuracy compared to the state of the art using middle fusion; on the other hand, our model provides interpretability through SHapley Additive exPlanations (SHAP).
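The two-stage pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the image-branch probability is simulated with synthetic numbers (in the paper it comes from a CNN or transformer over the lesion image), the clinical features and labels are synthetic, and the external classifier is assumed to be a scikit-learn tree ensemble, which is the kind of model SHAP's tree explainer can attribute over.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
n = 500

# Stage 1 (simulated): probability for the target class produced by the
# image classifier (CNN/transformer) on each lesion image.
img_prob = rng.uniform(0.0, 1.0, size=n)

# Hypothetical clinical/sociodemographic features (e.g., age, lesion size).
clinical = rng.normal(size=(n, 3))

# Synthetic ground truth loosely driven by both modalities.
y = ((img_prob + 0.3 * clinical[:, 0]) > 0.6).astype(int)

# Stage 2: concatenate the image probability with the tabular data and
# train an external classifier that emits the final diagnosis.
X = np.column_stack([img_prob, clinical])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
score = balanced_accuracy_score(y_te, clf.predict(X_te))
print(f"balanced accuracy: {score:.3f}")

# Interpretability: because the final decision is made by a standard tree
# ensemble over named features, a tool such as shap.TreeExplainer(clf)
# can attribute each diagnosis to the image probability and to each
# clinical/sociodemographic feature.
```

The design point is that interpretability comes from *where* the fusion happens: instead of mixing modalities inside a deep network (middle fusion), the image branch is collapsed to a single probability, so the external classifier operates on human-readable inputs that SHAP can explain directly.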
Published
29/09/2025
LIMA, Leandro M. de; ROCHA, Matheus B.; KROHLING, Renato A. A Novel Interpretable Approach to Deep Multimodal Data Fusion Applied to Cancer Diagnosis. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 35., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 35-49. ISSN 2643-6264.