Psoriasis Detection Using Computer Vision: A Comparative Approach Between CNNs and Vision Transformers
Abstract
This paper compares the performance of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) on the multiclass classification of images containing lesions of psoriasis and of diseases that resemble it. Models pre-trained on ImageNet were adapted to a specific dataset. Both families achieved high predictive metrics, but the ViTs stood out for their superior performance with smaller models. Dual Attention Vision Transformer-Base (DaViT-B) obtained the best results, with an F1-score of 96.4%, and is recommended as the most efficient architecture for automated psoriasis detection. This article reinforces the potential of ViTs for medical image classification tasks.
References
Alexey, D. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
Atlas Dermatologico (2024). Dermatology atlas. [link]. Accessed: 19/04/2024.
Bin Ji, Yiyi Wang, D. Z. (2022). Automatic detection and evaluation of nail psoriasis based on deep learning: a preliminary application and exploration. SPIE International Conference on Computer Application and Information Security.
Brasil, A. (2020). Estudo mostra que mais de 90% da população desconhecem a psoríase. [link]. Accessed: 29/04/2024.
Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1251–1258.
Danderm (2017). Atlas of dermatology. [link]. Accessed: 17/04/2024.
Dash, M. et al. (2020). A cascaded deep convolution neural network based cadx system for psoriasis lesion segmentation and severity assessment. Applied Soft Computing.
de Dermatologia, S. B. (2023). O que é a psoríase? [link]. Accessed: 29/04/2024.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. IEEE.
Dermatoweb (2002). Web docente de dermatologia. [link]. Accessed: 22/04/2024.
DermIS (2022). Dermatology information system. [link]. Accessed: 26/04/2024.
DermNetNZ (2024). The world's leading free dermatology website. [link]. Accessed: 29/04/2024.
Ding, M., Xiao, B., Codella, N., Luo, P., Wang, J., and Yuan, L. (2022). Davit: Dual attention vision transformers. In European conference on computer vision, pages 74–92. Springer.
Dozat, T. (2016). Incorporating nesterov momentum into adam. In Proceedings of the 4th International Conference on Learning Representations, Workshop Track, pages 1–4.
Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Yang, A., Fan, A., et al. (2024). The llama 3 herd of models. arXiv preprint arXiv:2407.21783.
Géron, A. (2019). Mãos À Obra: Aprendizado de Máquina com Scikit-Learn E TensorFlow. Alta Books.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778.
Hellenic Dermatological Atlas (2011). For health professionals and public. [link]. Accessed: 16/04/2024.
Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4700–4708.
Hurst, A., Lerer, A., Goucher, A. P., Perelman, A., Ramesh, A., Clark, A., Ostrow, A., Welihinda, A., Hayes, A., Radford, A., et al. (2024). Gpt-4o system card. arXiv preprint arXiv:2410.21276.
Huynh, N. Q., Xu, X., Kong, A. W. K., and Subbiah, S. (2014). A preliminary report on a full-body imaging system for effectively collecting and processing biometric traits of prisoners. In 2014 IEEE Symposium on Computational Intelligence in Biometrics and Identity Management (CIBIM).
Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. International Conference on Learning Representations.
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2. Morgan Kaufmann Publishers Inc.
LeCun, Y., Denker, J., and Solla, S. (1989). Optimal brain damage. Advances in neural information processing systems, 2.
Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021). Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022.
Liu, Z., Mao, H., Wu, C.-Y., Feichtenhofer, C., Darrell, T., and Xie, S. (2022). A convnet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11976–11986.
Luque, A., Carrasco, A., Martín, A., and de Las Heras, A. (2019). The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognition, 91:216–231.
Maduranga, P. and Nandasena, D. (2022). Mobile-based skin disease diagnosis system using convolutional neural networks (cnn). International Journal of Image, Graphics and Signal Processing, 14:47–57.
Marcel, S. and Rodriguez, Y. (2010). Torchvision the machine-vision package of torch. [link].
Meng, X. (2013). Scalable simple random sampling and stratified sampling. In Dasgupta, S. and McAllester, D., editors, Proceedings of the 30th International Conference on Machine Learning, volume 28 of Proceedings of Machine Learning Research, pages 531–539, Atlanta, Georgia, USA. PMLR.
Milani, A. et al. (2023). A deep learning application for psoriasis detection. Anais do Encontro Nacional de Inteligência Artificial e Computacional (ENIAC), pages 315–329.
Mohan, J., Sivasubramanian, A., Sowmya, V., and Vinayakumar, R. (2024). Enhancing skin disease classification leveraging transformer-based deep learning architectures and explainable ai. arXiv preprint arXiv:2407.14757.
Olescki, G. (2021). Detecção de tromboembolia pulmonar utilizando redes neurais convolucionais e extração de características. Anais do XXI Simpósio Brasileiro de Computação Aplicada à Saúde, pages 381–391.
Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., El-Nouby, A., et al. (2023). Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193.
Rodrigues, A. P. and Teixeira, R. M. (2009). Desvendando a psoríase. RBAC.
Roslan, R., Razly, I., Sabri, B., and Ibrahim, Z. (2020). Evaluation of psoriasis skin disease classification using convolutional neural network. IAES International Journal of Artificial Intelligence (IJ-AI), 9:349.
Silva, G. et al. (2022). Cardiac arrhythmia detection in ecg signals using graph convolutional network. Anais do XXII Simpósio Brasileiro de Computação Aplicada à Saúde, pages 25–35.
Simonyan, K. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2818–2826.
Tan, M. and Le, Q. (2021). Efficientnetv2: Smaller models and faster training. In International conference on machine learning, pages 10096–10106. PMLR.
Team, G., Georgiev, P., Lei, V. I., Burnell, R., Bai, L., Gulati, A., Tanzer, G., Vincent, D., Pan, Z., Wang, S., et al. (2024). Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv preprint arXiv:2403.05530.
Tu, Z., Talebi, H., Zhang, H., Yang, F., Milanfar, P., Bovik, A., and Li, Y. (2022). Maxvit: Multi-axis vision transformer. In European conference on computer vision, pages 459–479. Springer.
Wei, M., Wu, Q., Ji, H., Wang, J., Lyu, T., Liu, J., and Zhao, L. (2023). A skin disease classification model based on densenet and convnext fusion. Electronics, 12(2):438.
Wightman, R. (2019). Pytorch image models. [link].
Zhao, S., Xie, B., Li, Y., Zhao, X.-y., Kuang, Y., Su, J., He, X.-y., Wu, X., Fan, W., Huang, K., et al. (2020). Smart identification of psoriasis by images using convolutional neural networks: a case study in china. Journal of the European Academy of Dermatology and Venereology, 34(3):518–524.
Published
2025-07-20
How to Cite
LUCENA, Natanael; SILVA, Fabio S. da; RIOS, Ricardo. Psoriasis Detection Using Computer Vision: A Comparative Approach Between CNNs and Vision Transformers. In: INTEGRATED SOFTWARE AND HARDWARE SEMINAR (SEMISH), 52., 2025, Maceió/AL. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 85-96. ISSN 2595-6205. DOI: https://doi.org/10.5753/semish.2025.7332.
