Evaluation of methods of counterfactual explanation - A qualitative and quantitative analysis
Abstract
There is growing concern about the explainability of machine learning algorithms. Explainability refers to the ability to understand and interpret the decisions made by models, that is, the process by which a model arrives at a given prediction or classification. A counterfactual explanation creates alternative examples for which the model's prediction differs from the original. This work raises and discusses essential features in the context of counterfactual explanation methods. To this end, the CSSE and LORE methods are evaluated and applied to twelve public datasets with different characteristics regarding the number of attributes and data types. This allows a better understanding of their strengths and weaknesses using standardized metrics across methods, which facilitates the selection and development of more effective strategies and helps identify cases where one approach may outperform another in the quality of its explanations. The survey measured validity, prolixity, sparsity, similarity, and hit rate. Overall, CSSE performed better on these metrics, except for sparsity.
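As a minimal sketch of two of the surveyed metrics (not the CSSE or LORE implementations), the snippet below illustrates validity (the counterfactual flips the model's class) and sparsity (the fraction of attributes left unchanged); the toy classifier, instance, and counterfactual are illustrative assumptions.

```python
def validity(model, counterfactual, original_class):
    # A counterfactual is valid if the model assigns it a class
    # different from the original instance's class.
    return model(counterfactual) != original_class

def sparsity(original, counterfactual):
    # Fraction of attributes left unchanged; higher means fewer edits.
    unchanged = sum(a == b for a, b in zip(original, counterfactual))
    return unchanged / len(original)

# Toy threshold classifier over two numeric attributes (an assumption).
model = lambda x: int(x[0] + x[1] > 1.0)

original = (0.2, 0.3)        # classified as 0
counterfactual = (0.2, 0.9)  # only the second attribute changed

print(validity(model, counterfactual, model(original)))  # True
print(sparsity(original, counterfactual))                # 0.5
```

In practice, methods such as CSSE and LORE search for counterfactuals that are valid while keeping changes minimal, which is exactly the tension these metrics capture.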
Keywords:
Machine Learning, Counterfactual Explanation, LORE, CSSE
References
Balbino, M. d. S., Zárate, L. E. G., and Nobre, C. N. CSSE - an agnostic method of counterfactual, selected, and social explanations for classification models. Expert Systems with Applications, 2023.
El Shawi, R., Sherif, Y., Al-Mallah, M., and Sakr, S. Interpretability in healthcare a comparative study of local machine learning interpretability techniques. In 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS). IEEE, pp. 275–280, 2019.
Guidotti, R. Counterfactual explanations and how to find them: literature review and benchmarking. Data Mining and Knowledge Discovery, 2022.
Guidotti, R., Monreale, A., Giannotti, F., Pedreschi, D., Ruggieri, S., and Turini, F. Factual and counterfactual explanations for black box decision making. IEEE Intelligent Systems 34 (6): 14–23, 2019.
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., and Pedreschi, D. A survey of methods for explaining black box models. ACM Comput. Surv. 51 (5), Aug., 2018.
Mothilal, R. K., Sharma, A., and Tan, C. Explaining machine learning classifiers through diverse counterfactual explanations. In Proceedings of the 2020 conference on fairness, accountability, and transparency. pp. 607–617, 2020.
Vermeire, T., Brughmans, D., Goethals, S., de Oliveira, R. M. B., and Martens, D. Explainable image classification with evidence counterfactual. Pattern Analysis and Applications 25 (2): 315–335, May, 2022.
Wachter, S., Mittelstadt, B., and Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech. vol. 31, pp. 841, 2017.
Zhang, C. and Lu, Y. Study on artificial intelligence: The state of the art and future prospects. Journal of Industrial Information Integration vol. 23, pp. 100224, 2021.
Zhang, X., Solar-Lezama, A., and Singh, R. Interpreting neural network judgments via minimal, stable, and symbolic corrections. Advances in neural information processing systems vol. 31, 2018.
Published
26/09/2023
How to Cite
KRAUSS, Omar F. de P. e; BALBINO, Marcelo de S.; NOBRE, Cristiane N. Evaluation of methods of counterfactual explanation - A qualitative and quantitative analysis. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 11., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 9-16. ISSN 2763-8944. DOI: https://doi.org/10.5753/kdmile.2023.232932.