Enhancing Fairness in Machine Learning: Skin Tone Classification Using the Monk Skin Tone Scale

  • Vitor Pereira Matias USP
  • João Batista Neto USP

Abstract


In the machine learning era, unethical errors caused by poorly curated datasets are a pressing issue, especially in fields related to skin tone recognition, where imbalanced datasets lead to biased results. A skin tone classification algorithm helps identify such imbalances. Existing methods range from classic computer vision pipelines to deep learning CNNs, and they typically rely on datasets captured in controlled environments with limited class diversity (two to six classes). Our work focuses on classifying skin tones using the 10-class Monk Skin Tone (MST) scale. To this end, we created the SkinTone in The Wild (STW) dataset by merging well-known face recognition datasets and labelling the images according to the MST scale. The dataset comprises 39,605 images of 2,183 individuals, mostly captured in uncontrolled environments. To cope with these conditions, we evaluated different approaches, which resulted in 74% accuracy and 92% off-by-one accuracy (OOAcc) with a RandomForest model, and 68% accuracy and 86% OOAcc with a DenseNet121 CNN. Furthermore, we discussed the representational power of CNNs and showed that the DenseNet121 architecture learned to predict skin tones by focusing on the background of the images. These results highlight the potential for accurate skin tone classification in machine learning, which in turn can lead to better-curated datasets.
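The two metrics reported above can be illustrated with a minimal sketch. The abstract does not spell out the formula for OOAcc, so this assumes the usual ordinal definition: a prediction counts as correct when it falls within one class of the true MST label. The function names and the sample labels are hypothetical and chosen only for illustration.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class exactly."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))


def off_by_one_accuracy(y_true, y_pred):
    """Fraction of predictions at most one class away from the true class.

    Assumed definition: classes are ordinal integers (here MST tones 1..10),
    and |true - predicted| <= 1 counts as a hit.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred) <= 1))


# Hypothetical MST labels for five images (not taken from the STW dataset).
y_true = [1, 5, 10, 3, 7]
y_pred = [1, 6, 8, 3, 9]

print(accuracy(y_true, y_pred))             # exact matches: 2 of 5
print(off_by_one_accuracy(y_true, y_pred))  # within one class: 3 of 5
```

Because the MST classes are ordered, OOAcc is always at least as high as plain accuracy; it separates near misses between adjacent tones from large errors.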

Published
30/09/2024
MATIAS, Vitor Pereira; BATISTA NETO, João. Enhancing Fairness in Machine Learning: Skin Tone Classification Using the Monk Skin Tone Scale. In: WORKSHOP DE TRABALHOS EM ANDAMENTO - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 37., 2024, Manaus/AM. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 76-81. DOI: https://doi.org/10.5753/sibgrapi.est.2024.31648.