Interpreting BERT-based stance classification: a case study about the Brazilian COVID vaccination

Abstract


Actions to control the COVID-19 pandemic should be based on scientific facts. However, Brazil faces a politically polarized scenario that has influenced the population's behavior regarding social distancing and vaccination. This paper addresses this subject by proposing a BERT-based stance classification model and an attention-based mechanism to identify the words that are influential for stance classification. The interpretation mechanism traces token attentions back to words, assigning each word absolute and relative attention scores. We use these metrics to assess whether words with high attention weights correspond to intrinsic properties of the domain and contribute to the correct classification of stances. Our experiments yielded good stance classification results (F1=0.752) and showed that 74% of the top-100 words with the highest absolute attention are representative of the arguments that support the investigated stances.
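As a rough illustration of the idea of tracing token attentions back to words, the sketch below aggregates the [CLS]-token attention of a BERT encoder over WordPiece sub-tokens. It assumes the HuggingFace transformers library and the BERTimbau checkpoint (Souza et al., 2020); the choice of the last layer averaged over heads, the function name word_attention_scores, and the sum/normalize aggregation are illustrative assumptions, not the paper's exact definitions of the absolute and relative scores.

```python
# Minimal sketch, NOT the authors' exact method: trace BERT sub-token
# attention back to words and assign absolute/relative word scores.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "neuralmind/bert-base-portuguese-cased"  # BERTimbau (Souza et al., 2020)
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_attentions=True)
model.eval()

def word_attention_scores(text: str) -> dict:
    """Map each word to (absolute, relative) attention received from [CLS]."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    # Attention paid by [CLS] to every token: last layer, averaged over heads
    # (an illustrative aggregation choice, not necessarily the paper's).
    cls_att = out.attentions[-1].mean(dim=1)[0, 0]      # shape: (seq_len,)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    words, absolute = {}, {}
    for pos, w_id in enumerate(enc.word_ids()):         # token -> word index
        if w_id is None:                                # skip [CLS] / [SEP]
            continue
        piece = tokens[pos][2:] if tokens[pos].startswith("##") else tokens[pos]
        words[w_id] = words.get(w_id, "") + piece       # rebuild word from pieces
        absolute[w_id] = absolute.get(w_id, 0.0) + cls_att[pos].item()
    total = sum(absolute.values())                      # normalizer for relative scores
    return {words[i]: (a, a / total) for i, a in absolute.items()}

print(word_attention_scores("A vacina contra a COVID-19 salva vidas."))
```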

Keywords: BERT, interpretability, stance classification, attention weights

References

Abnar, S. and Zuidema, W. (2020). Quantifying attention flow in transformers. In Proc. of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4190–4197.

ALDayel, A. and Magdy, W. (2021). Stance detection on social media: State of the art and trends. Information Processing & Management, 58(4):102597.

Chefer, H., Gur, S., and Wolf, L. (2021). Transformer interpretability beyond attention visualization. In Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), pages 782–791.

Clark, J. H., Garrette, D., Turc, I., and Wieting, J. (2021). Canine: Pre-training an efficient tokenization-free encoder for language representation. arXiv preprint arXiv:2103.06874.

Devlin, J., Chang, M., Lee, K., and Toutanova, K. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. of the 2019 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (NAACL-HLT), pages 4171–4186.

Ebeling, R., Régis, C. C., Nobre, J. C., and Becker, K. (2022). Analysis of the influence of political polarization in the vaccination stance: the Brazilian COVID-19 scenario. In Proc. of the 16th Intl. Conference on Web and Social Media (ICWSM). To appear.

Ebeling, R., Sáenz, C. C., Nobre, J. C., and Becker, K. (2020). Quarenteners vs. cloroquiners: a framework to analyze the effect of political polarization on social distance stances. In Proc. of the VIII Symposium on Knowledge Discovery, Mining and Learning, pages 89–96. SBC.

Giorgioni, S., Politi, M., Salman, S., Basili, R., and Croce, D. (2020). UNITOR @ SardiStance2020: Combining transformer-based architectures and transfer learning for robust stance detection. In Proc. of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), volume 2765 of CEUR Workshop Proceedings. CEUR-WS.org.

Jain, S. and Wallace, B. C. (2019). Attention is not Explanation. In Proc. of the 2019 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 3543–3556.

Kawintiranon, K. and Singh, L. (2021). Knowledge enhanced masked language model for stance detection. In Proc. of the 2021 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4725–4735.

Kokalj, E., Škrlj, B., Lavrač, N., Pollak, S., and Robnik-Šikonja, M. (2021). BERT meets Shapley: Extending SHAP explanations to transformer-based classifiers. In Proc. of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pages 16–21.

Molnar, C. (2019). Interpretable Machine Learning. https://christophm.github.io/interpretable-ml-book/.

Rogers, A., Kovaleva, O., and Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8:842–866.

Souza, F., Nogueira, R., and Lotufo, R. (2020). BERTimbau: Pretrained BERT models for Brazilian Portuguese. In Cerri, R. and Prati, R. C., editors, Intelligent Systems, pages 403–417, Cham. Springer International Publishing.

Vig, J. (2019). A multiscale visualization of attention in the transformer model. In Proc. of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 37–42.

Wiegreffe, S. and Pinter, Y. (2019). Attention is not not explanation. In Proc. of the 2019 Conf. on Empirical Methods in Natural Language Processing and the 9th International Joint Conf. on Natural Language Processing (EMNLP-IJCNLP), pages 11–20.
Published
04/10/2021
How to Cite

SÁENZ, Carlos Abel Córdova; BECKER, Karin. Interpreting BERT-based stance classification: a case study about the Brazilian COVID vaccination. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 36., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 73-84. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2021.17867.