Machine Learning post-hoc interpretability: a systematic mapping study
Abstract
Context: In the pre-algorithm world, humans and organizations made decisions about hiring and criminal sentencing. Nowadays, some of these decisions are made entirely by, or influenced by, Machine Learning algorithms. Problem: Research is starting to reveal troubling examples in which algorithmic decision-making risks replicating, and even amplifying, human biases. In addition, most algorithmic decision systems are opaque and not interpretable, which makes potential biases harder to detect and mitigate. Solution: This paper presents an overview of the current literature on machine learning interpretability. IS Theory: This work was conceived under the aegis of Sociotechnical theory: Artificial Intelligence systems can only be understood and improved if both ‘social’ and ‘technical’ aspects are brought together and treated as interdependent parts of a complex system. Method: The overview presented in this article results from a systematic mapping study. Summary of Results: We find that most current XAI studies are aimed not at the end users affected by a model but at the data scientists who use explainability as a debugging tool. There is thus a gap in the quality assessment and deployment of interpretable methods. Contributions and Impact in the IS area: The main contribution of the paper is to serve as the motivating background for a series of challenges faced by XAI, such as combining different interpretable methods, evaluating interpretability, and building human-centered methods. We end by discussing concerns raised regarding explainability and by presenting a series of questions that can serve as an agenda for future research in the field.
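To make concrete what post-hoc, model-agnostic interpretability means in practice, consider the following minimal sketch (illustrative only, assuming a Python environment with scikit-learn installed): a black-box classifier is trained first and only afterwards explained with permutation feature importance, without inspecting its internals. Local post-hoc methods such as LIME and Shapley-value attributions, cited in the references below, follow the same pattern but explain individual predictions rather than the model as a whole.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    # Train a "black-box" model: accurate, but not directly interpretable.
    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    # Post-hoc, model-agnostic explanation: shuffle each feature on held-out data and
    # measure how much the score drops; larger drops indicate more important features.
    result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
    ranked = sorted(zip(X.columns, result.importances_mean), key=lambda p: p[1], reverse=True)
    for name, importance in ranked[:5]:
        print(f"{name}: {importance:.3f}")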
Keywords:
xai, machine learning, explainability, interpretability, fairness, black-box
References
David Alvarez-Melis and Tommi S. Jaakkola. 2018. On the Robustness of Interpretability Methods. CoRR abs/1806.08049 (2018), 6. arxiv:1806.08049 http://arxiv.org/abs/1806.08049
Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador Garcia, Sergio Gil-Lopez, Daniel Molina, Richard Benjamins, Raja Chatila, and Francisco Herrera. 2020. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion 58 (2020), 82–115. https://doi.org/10.1016/j.inffus.2019.12.012
Umang Bhatt, Alice Xiang, Shubham Sharma, Adrian Weller, Ankur Taly, Yunhan Jia, Joydeep Ghosh, Ruchir Puri, José M. F. Moura, and Peter Eckersley. 2020. Explainable Machine Learning in Deployment. arxiv:1909.06342 [cs.LG]
Adrien Bibal and Benoît Frénay. 2016. Interpretability of Machine Learning Models and Representations: an Introduction. In 24th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2016), Michel Verleysen (Ed.). CIACO, Bruges, Belgium, 77–82.
Adam Bloniarz, Ameet Talwalkar, Bin Yu, and Christopher Wu. 2016. Supervised Neighborhoods for Distributed Nonparametric Regression. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol. 51), Arthur Gretton and Christian C. Robert (Eds.). PMLR, Cadiz, Spain, 1450–1459. https://proceedings.mlr.press/v51/bloniarz16.html
Diogo V. Carvalho, Eduardo M. Pereira, and Jaime S. Cardoso. 2019. Machine Learning Interpretability: A Survey on Methods and Metrics. Electronics 8, 8 (2019), 34. https://doi.org/10.3390/electronics8080832
Mark W. Craven and Jude W. Shavlik. 1995. Extracting Tree-structured Representations of Trained Networks. In Proceedings of the 8th International Conference on Neural Information Processing Systems (Denver, Colorado) (NIPS’95). MIT Press, Cambridge, MA, USA, 24–30. Available at http://dl.acm.org/citation.cfm?id=2998828.2998832.
Mengnan Du, Ninghao Liu, and Xia Hu. 2019. Techniques for Interpretable Machine Learning. arxiv:1808.00033 [cs.LG]
Benjamin P. Evans, Bing Xue, and Mengjie Zhang. 2019. What's inside the Black-Box? A Genetic Programming Method for Interpreting Complex Machine Learning Models. In Proceedings of the Genetic and Evolutionary Computation Conference (Prague, Czech Republic) (GECCO ’19). Association for Computing Machinery, New York, NY, USA, 1012–1020. https://doi.org/10.1145/3321707.3321726
Timnit Gebru. 2019. Oxford Handbook on AI Ethics Book Chapter on Race and Gender. arxiv:1908.06165 [cs.CY]
Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. 2019. Explaining Explanations: An Overview of Interpretability of Machine Learning. arxiv:1806.00069 [cs.AI]
Bryce Goodman and Seth Flaxman. 2017. European Union Regulations on Algorithmic Decision-Making and a “Right to Explanation”. AI Magazine 38, 3 (02 Oct 2017), 50–57. https://doi.org/10.1609/aimag.v38i3.2741
Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Dino Pedreschi, and Fosca Giannotti. 2018. A Survey Of Methods For Explaining Black Box Models. arxiv:1802.01933 [cs.CY] https://arxiv.org/abs/1802.01933
David Gunning. 2017. Explainable Artificial Intelligence (XAI). DARPA. Available at https://www.darpa.mil/attachments/XAIProgramUpdate.pdf.
Mark Ibrahim, Melissa Louie, Ceena Modarres, and John Paisley. 2019. Global Explanations of Neural Networks: Mapping the Landscape of Predictions. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (Honolulu, HI, USA) (AIES ’19). Association for Computing Machinery, New York, NY, USA, 279–287. https://doi.org/10.1145/3306618.3314230
Ulf Johansson, Rikard König, and Lars Niklasson. 2010. Genetic Rule Extraction Optimizing Brier Score. In Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation (Portland, Oregon, USA) (GECCO ’10). Association for Computing Machinery, New York, NY, USA, 1007–1014. https://doi.org/10.1145/1830483.1830668
Jalil Kazemitabar, Arash Amini, Adam Bloniarz, and Ameet S Talwalkar. 2017. Variable Importance Using Decision Trees. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Curran Associates, Inc., Long Beach, California, USA, 426–435. http://papers.nips.cc/paper/6646-variable-importance-using-decision-trees.pdf
Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Jure Leskovec. 2019. Faithful and Customizable Explanations of Black Box Models. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (Honolulu, HI, USA) (AIES ’19). Association for Computing Machinery, New York, NY, USA, 131–138. https://doi.org/10.1145/3306618.3314229
Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and Marcin Detyniecki. 2019. The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations. arxiv:1907.09294 [cs.LG]
Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and Marcin Detyniecki. 2019. Unjustified Classification Regions and Counterfactual Explanations In Machine Learning. In Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019 (Lecture Notes in Computer Science, Vol. 11907). Springer, Würzburg, Germany, 37–54. https://doi.org/10.1007/978-3-030-46147-8_3
Ana Lucic, Hinda Haned, and Maarten de Rijke. 2020. Why Does My Model Fail? Contrastive Local Explanations for Retail Forecasting. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 90–98. https://doi.org/10.1145/3351095.3372824
Andreas Messalas, Yiannis Kanellopoulos, and Christos Makris. 2019. Model-Agnostic Interpretability with Shapley Values. In 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA). IEEE, Patras, Greece, 1–7.
Sina Mohseni, Niloofar Zarei, and Eric D. Ragan. 2020. A Multidisciplinary Survey and Framework for Design and Evaluation of Explainable AI Systems. arxiv:1811.11839 [cs.HC]
Christoph Molnar. 2017. Interpretable Machine Learning. https://christophm.github.io/interpretable-ml-book
Ramaravind K. Mothilal, Amit Sharma, and Chenhao Tan. 2020. Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 607–617. https://doi.org/10.1145/3351095.3372850
Cathy O'Neil. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown Publishing Group, USA.
Gregory Plumb, Denali Molitor, and Ameet Talwalkar. 2018. Model Agnostic Supervised Local Explanations. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (Montréal, Canada) (NIPS’18). Curran Associates Inc., Red Hook, NY, USA, 2520–2529.
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Model-Agnostic Interpretability of Machine Learning. arXiv abs/1606.05386 (2016), 5.
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (San Francisco, California, USA) (KDD ’16). Association for Computing Machinery, New York, NY, USA, 1135–1144. https://doi.org/10.1145/2939672.2939778
Mireia Ribera and Agata Lapedriza. 2019. Can we do better explanations? A proposal of User-Centered Explainable AI.
Shubham Sharma, Jette Henderson, and Joydeep Ghosh. 2020. CERTIFAI: A Common Framework to Provide Explanations and Analyse the Fairness and Robustness of Black-Box Models. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (New York, NY, USA) (AIES ’20). Association for Computing Machinery, New York, NY, USA, 166–172. https://doi.org/10.1145/3375627.3375812
Shohei Shirataki and Saneyasu Yamaguchi. 2017. A study on interpretability of decision of machine learning. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, Boston, MA, USA, 4830–4831. https://doi.org/10.1109/BigData.2017.8258557
Dylan Slack, Sorelle A. Friedler, and Emile Givental. 2020. Fairness Warnings and Fair-MAML: Learning Fairly with Minimal Data. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 200–209. https://doi.org/10.1145/3351095.3372839
Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. 2020. Fooling LIME and SHAP: Adversarial Attacks on Post Hoc Explanation Methods. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (New York, NY, USA) (AIES ’20). Association for Computing Machinery, New York, NY, USA, 180–186. https://doi.org/10.1145/3375627.3375830
Sandra Wachter, Brent Mittelstadt, and Chris Russell. 2018. Counterfactual Explanations Without Opening the Black Box: Automated Decisions and the GDPR. Harvard Journal of Law & Technology 31 (April 2018), 841–887.
Published
May 16, 2022
How to Cite
VIEIRA, Carla Piazzon; DIGIAMPIETRI, Luciano Antonio. Machine Learning post-hoc interpretability: a systematic mapping study. In: SIMPÓSIO BRASILEIRO DE SISTEMAS DE INFORMAÇÃO (SBSI), 18., 2022, Curitiba. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.