Improving the Efficiency and Fairness of a Perspectivist Approach to Irony Detection
Abstract
Text classification tasks such as hate speech and irony detection are shaped by culture and by personal interpretation. Unlike approaches that aggregate annotator opinions (for instance, by majority voting), perspectivism models the views of specific annotator groups to produce fairer models, but it often incurs high computational costs because it relies on fine-tuning language models. This work explores traditional machine learning methods (SVM, Random Forest, XGBoost) to reduce these costs and uses calibration to address inference biases. Results show up to 12 times faster processing with no statistically significant loss in effectiveness, along with improved fairness through reduced bias.
Keywords:
perspectivism, neural networks, combination, irony detection, calibration
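To make the role of calibration concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which calibration method or metric was used, so the expected calibration error (ECE) and the confidence-weighted combination below are stand-ins, and the function names and example scores are hypothetical.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Weighted mean gap between average confidence and accuracy per bin."""
    total = len(confidences)
    if total == 0:
        return 0.0
    # Group predictions into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


def confidence_weighted_vote(p_ironic: Sequence[float]) -> float:
    """Combine per-group P(ironic) scores, weighting each by its distance from 0.5."""
    weights = [abs(p - 0.5) * 2.0 for p in p_ironic]
    weight_sum = sum(weights)
    if weight_sum == 0.0:
        return 0.5  # every model is maximally unsure
    return sum(w * p for w, p in zip(weights, p_ironic)) / weight_sum


if __name__ == "__main__":
    # Hypothetical scores: an overconfident model that is right only half the time.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, False, False]))
    # Two hypothetical perspective-aware models disagreeing on one text.
    print(confidence_weighted_vote([0.9, 0.4]))
```

The second function suggests why calibration could matter when perspective-specific models are combined: if one group's model is systematically overconfident, its scores dominate the weighted vote, which is one plausible way an inference bias toward that group could arise.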
Published
10/11/2025
How to Cite
JESUS, Samuel B.; BIANCO, Guilherme D.; JUNIOR, Wanderlei; BASILE, Valerio; GONÇALVES, Marcos André. Aprimorando a Eficiência e a Equidade de uma Abordagem Perspectivista Para Detecção de Ironia. In: CONCURSO DE TRABALHOS DE INICIAÇÃO CIENTÍFICA - SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 31., 2025, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 37-40. ISSN 2596-1683. DOI: https://doi.org/10.5753/webmedia_estendido.2025.16379.
