Exploring the Use of Vision Transformer in the Classification of Anal and Cervical Lesions: A Systematic Review
Abstract
Although anal canal cancer (ACC) has a low incidence among gastrointestinal tract tumors, a significant increase in the number of cases has been observed, making it an emerging public health concern. This paper presents a systematic review of the use of the Vision Transformer architecture in classifying cervical and anal cells, evaluating its performance compared with other machine learning architectures. Cervical cells were included in the study because of their histological similarity to anal cells and the widespread use of the Pap test. The objective is to identify the state of the art in the diagnosis of cervical cancer through conventional cytology and to explore the feasibility of using transfer learning for application in anal cancer.
References
Abinaya, K. and Sivakumar, B. (2024). A deep learning-based approach for cervical cancer classification using 3d cnn and vision transformer. Journal of Imaging Informatics in Medicine, 37:280–296.
AlMohimeed, A., Shehata, M., et al. (2024). Vit-pso-svm: Cervical cancer prediction based on integrating vision transformer with particle swarm optimization and support vector machine. Bioengineering, 11(7):729.
Emara, H. M., El-Shafai, W., et al. (2024). Cervical cancer detection: A comprehensive evaluation of cnn models, vision transformer approaches, and fusion strategies. IEEE Access.
Fang, M., Fu, M., Liao, B., et al. (2024). Deep integrated fusion of local and global features for cervical cell classification. Computers in Biology and Medicine, 171:108153.
Group, C. R. (2021). Cric searchable image database. Federal University of Minas Gerais. Available at: [link].
Hemalatha, K., Vetriselvi, V., Dhandapani, M., and Gladys, A. A. (2023). Cervixfuzzy-fusion for cervical cancer cell image classification. Biomedical Signal Processing and Control, 85:104920.
Jantzen, J., Dounias, G., and Norup, J. (2005). The herlev pap smear dataset. University of Denmark. Available at: [link].
Khowaja, A., Zou, B., and Kui, X. (2024). Enhancing cervical cancer diagnosis: Integrated attention-transformer system with weakly supervised learning. Image and Vision Computing, 149:105193.
Khowaja, A., Zou, B., and Xiaoyan, K. (2023). Cervix visionator elm: A novel approach to early detection of cervical cancer.
Li, M., Que, N., Zhang, J., Du, P., and Dai, Y. (2024). Vtcnet: A feature fusion dl model based on cnn and vit for the classification of cervical cells. International Journal of Imaging Systems and Technology, 34(5).
Liu, W., Li, C., Xu, N., et al. (2022). Cvm-cervix: A hybrid cervical pap-smear image classification framework using cnn, visual transformer, and multilayer perceptron. Pattern Recognition, 130:108829.
Maurya, R., Pandey, N. N., and Dutta, M. K. (2023). Visioncervix: Papanicolaou cervical smears classification using novel cnn-vision ensemble approach. Biomedical Signal Processing and Control, 79(Part 2):104156.
Pacal, I. (2024). Maxcervixt: A novel lightweight vision transformer-based approach for precise cervical cancer detection. Knowledge-Based Systems, 289:111482.
Plissiti, M. E., Nikou, C., and Charchanti, A. (2020). Sipakmed: A dataset for classification and detection of cervical cancer cells. IEEE Journal of Biomedical and Health Informatics, 24(4):1175–1185.
Robb, B. W. and Mutch, M. G. (2006). Epidermoid carcinoma of the anal canal. Clinics in Colon and Rectal Surgery, 19(2):108–115.
Sholik, M., Fatichah, C., and Amaliah, B. (2024). Deep feature extraction of pap smear images based on convolutional neural network and vision transformer for cervical cancer classification. In 2024 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT), pages 290–296, Bali, Indonesia.
Stewart, D. B., Gaertner, W. B., Glasgow, S. C., Herzig, D. O., Feingold, D., and Steele, S. R. (2018). The American Society of Colon and Rectal Surgeons clinical practice guidelines for anal squamous cell cancers (revised 2018). Diseases of the Colon & Rectum, 61(7):755–774. Prepared on behalf of the Clinical Practice Guidelines Committee of the American Society of Colon and Rectal Surgeons.
Published
2025-06-09
How to Cite
BROMERSCHENCKEL, Ingrid; CAMPOS, Andrea Gomes. Exploring the Use of Vision Transformer in the Classification of Anal and Cervical Lesions: A Systematic Review. In: BRAZILIAN SYMPOSIUM ON COMPUTING APPLIED TO HEALTH (SBCAS), 25., 2025, Porto Alegre/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1050-1056. ISSN 2763-8952. DOI: https://doi.org/10.5753/sbcas.2025.7874.
