BioSpectralFormer: A Transformer-Based Architecture for FTIR Spectra Classification in Oral Cancer Diagnosis
Abstract
Fourier-transform infrared (FTIR) spectroscopy is a promising non-invasive technique for diagnosing oral cancer, a disease whose high mortality is largely attributable to late-stage diagnosis. This work proposes BioSpectralFormer (BSF), a Transformer-based architecture that combines two types of attention to classify salivary FTIR spectra. Evaluated on real spectral data under stratified 10-fold cross-validation against seven baselines, BSF achieved competitive balanced accuracy (Mean(SE,SP) = 0.67±0.15, where SE is sensitivity and SP is specificity) with high sensitivity (0.82±0.20), placing it in the same statistical tier as state-of-the-art methods. Moreover, attention map analysis highlighted spectral regions associated with established oral cancer biomarkers, including Amide I and lipid C-H stretching, suggesting that the model learns biologically meaningful features.
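The balanced accuracy reported above, Mean(SE,SP), can be illustrated with a minimal sketch. The function names (`sensitivity_specificity`, `mean_se_sp`, `summarize_folds`) and the toy labels are hypothetical and do not come from the paper. The sketch also assumes that the value after "±" is the sample standard deviation across folds, which the abstract does not state.

```python
from statistics import mean, stdev


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against folds that contain no positives or no negatives.
    se = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return se, sp


def mean_se_sp(y_true, y_pred, positive=1):
    """Balanced accuracy: arithmetic mean of sensitivity and specificity."""
    se, sp = sensitivity_specificity(y_true, y_pred, positive)
    return (se + sp) / 2


def summarize_folds(fold_scores):
    """Mean and sample standard deviation of per-fold scores (assumed '±')."""
    return mean(fold_scores), stdev(fold_scores)


# Toy example (hypothetical labels, 1 = cancer, 0 = control).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]
se, sp = sensitivity_specificity(y_true, y_pred)
score = mean_se_sp(y_true, y_pred)
```

In a cross-validated setting, `mean_se_sp` would be computed once per held-out fold and the ten values passed to `summarize_folds`. Because the metric weights both classes equally, a model with high sensitivity can still score modestly when its specificity is low.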
