Machine-based Stereotypes: How Machine Learning Algorithms Evaluate Ethnicity from Face Data


Context: Soft biometrics is a field that aids traditional biometrics by identifying attributes through descriptors such as hair type, ethnicity, or gender. When applied to large search sets, it narrows the search space, thus increasing performance. Problem: Ethnicity classification is controversial because the definitions used to classify individuals are often unclear or stereotypical, producing incorrect results. While many applications use face images to ascertain ethnicity, no extensive study has mapped these approaches. Solution: A literature survey of the strategies, data, and results used for ethnicity classification is conducted to understand how the issue is treated in the literature. IS Theory: Given the societal nature of ethnicity, applications that evaluate ethnicity are subject to the Social Shaping of Technology. Investigating these applications can help identify the societal influence of those technologies. Methods: We present a systematic literature review (SLR) to map the algorithms, the types and sources of the data, and the results obtained for classifying ethnicity based on the human face. Summary of Results: Many algorithms are currently used to ascertain ethnicity from face images. These algorithms may use information from the whole face or only from partial regions of it. The strategies may use symbolic data about the face, such as face measurements, or subsymbolic data. Classification performance appears to have stagnated. However, datasets overrepresent White, Black, and Asian subjects, and few datasets are balanced. Contributions and Impacts in the IS area: This study offers an up-to-date look at how ethnicity classification is done, the data used, and the way society may be shaping technology.

Keywords: Race and Ethnicity, Ethnicity Classification, Machine Learning, Fairness


