Performance analysis of machine learning algorithms trained on biased data
Abstract
Artificial Intelligence and Machine Learning algorithms are now common in everyday life across several areas, bringing many possibilities and benefits to society. However, as learning algorithms are given more room to make decisions, the range of related ethical issues has also expanded. There have been many reports of Machine Learning applications that exhibit some kind of bias, disadvantaging or favoring a particular group and potentially harming real people. The present work aims to shed light on the existence of such biases by analyzing and comparing the behavior of different learning algorithms (namely Decision Tree, MLP, Naive Bayes, Random Forest, Logistic Regression and SVM) when trained on biased data. We also employ bias-mitigation pre-processing algorithms provided by IBM's AI Fairness 360 framework.
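As a minimal sketch of the kind of pipeline described above (not the paper's exact experimental setup), the Python snippet below assumes the UCI Adult dataset (Kohavi and Becker, 1994), AIF360's Reweighing pre-processing algorithm (Kamiran and Calders, 2012), the disparate impact metric (Feldman et al., 2015), 'sex' as the protected attribute, and a scikit-learn Logistic Regression as the classifier.

```python
# Minimal sketch (not the paper's exact pipeline): measure disparate impact
# on the UCI Adult dataset, mitigate it with AIF360's Reweighing
# pre-processing, and train one of the studied classifiers on the
# reweighted data. Assumes aif360, scikit-learn and the raw Adult data
# files expected by aif360 are available locally.
from aif360.datasets import AdultDataset
from aif360.algorithms.preprocessing import Reweighing
from aif360.metrics import BinaryLabelDatasetMetric
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# 'sex' is a default protected attribute of AdultDataset, encoded Male = 1.
privileged = [{'sex': 1}]
unprivileged = [{'sex': 0}]

data = AdultDataset()
train, test = data.split([0.7], shuffle=True)

# Disparate impact before mitigation (a value of 1.0 indicates parity).
metric = BinaryLabelDatasetMetric(
    train, unprivileged_groups=unprivileged, privileged_groups=privileged)
print('Disparate impact (original):', metric.disparate_impact())

# Reweighing assigns instance weights that balance favorable outcomes
# across the privileged and unprivileged groups.
rw = Reweighing(unprivileged_groups=unprivileged,
                privileged_groups=privileged)
train_rw = rw.fit_transform(train)

metric_rw = BinaryLabelDatasetMetric(
    train_rw, unprivileged_groups=unprivileged, privileged_groups=privileged)
print('Disparate impact (reweighed):', metric_rw.disparate_impact())

# Any scikit-learn classifier that accepts sample_weight can use the
# weights produced by Reweighing (Logistic Regression shown here).
scaler = StandardScaler().fit(train_rw.features)
clf = LogisticRegression(max_iter=1000)
clf.fit(scaler.transform(train_rw.features), train_rw.labels.ravel(),
        sample_weight=train_rw.instance_weights)
print('Test accuracy:', clf.score(scaler.transform(test.features),
                                  test.labels.ravel()))
```

The same instance weights can be passed to scikit-learn's Decision Tree, Random Forest, Naive Bayes and SVM implementations, which also accept sample_weight; MLPClassifier does not, so a resampling-based pre-processing step would be needed for that model instead.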
References
Angwin, J., Larson, J., Mattu, S., and Kirchner, L. (2016). Machine bias: There's software used across the country to predict future criminals. And it's biased against blacks. ProPublica. URL [link].
Bellamy, R. K., Dey, K., Hind, M., Hoffman, S. C., Houde, S., Kannan, K., Lohia, P., Martino, J., Mehta, S., Mojsilovic, A., et al. (2018). AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. arXiv preprint arXiv:1810.01943.
Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
Brodersen, K. H., Ong, C. S., Stephan, K. E., and Buhmann, J. M. (2010). The balanced accuracy and its posterior distribution. In 2010 20th International Conference on Pattern Recognition, pages 3121–3124. IEEE.
Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., and Venkatasubramanian, S. (2015). Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268.
Howard, A. and Borenstein, J. (2018). The ugly truth about ourselves and our robot creations: the problem of bias and social inequity. Science and Engineering Ethics, 24(5):1521–1536.
Kamiran, F. and Calders, T. (2012). Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1):1–33.
Kohavi, R. and Becker, B. (1994). UCI Machine Learning Repository.
Maybury, M. T. (1990). The mind matters: artificial intelligence and its societal implications. IEEE Technology and Society Magazine, 9(2):7–15.
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., and Galstyan, A. (2019). A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635.