Beyond Systematic Bias: Investigating Gender Differences in Portuguese Text Classification Annotation Patterns

  • Alexander Feitosa Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ)
  • Érica Carneiro Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ)
  • Gustavo Guedes Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ) https://orcid.org/0000-0001-8593-1506

Abstract

This study investigates how gendered annotation patterns influence sentiment classification in Brazilian Portuguese and whether these patterns are preserved, or even amplified, by machine learning models. A corpus of 1,465 diary-style sentences was independently labeled by two gender-balanced annotator groups. Despite high overall agreement (Cohen’s κ = 0.8177), statistical analyses revealed divergent annotation behaviors: male annotators exhibited higher internal consistency and lower label entropy, while female annotators showed greater variability and a higher proportion of neutral labels. Classifiers trained separately on each group’s labels (SVM, Logistic Regression, Naive Bayes, Random Forest, and Decision Tree) reproduced these divergences to varying degrees. Notably, agreement between the gendered models ranged from κ = 0.3838 (Decision Tree) to κ = 0.6952 (Logistic Regression), indicating that learned behavior can differ substantially depending on the annotation source. These findings reinforce that annotation is a socially situated process: gendered interpretive divergences can propagate through learning pipelines, shaping model behavior in ways that reflect, and potentially amplify, gender bias, often going unnoticed without annotation-aware evaluation strategies. Ethical approval was granted under protocol CAAE 82267824.8.0000.5289.
Keywords: annotation behavior, gender bias, sentiment classification, supervised learning, fairness in NLP
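The agreement statistic reported in the abstract can be illustrated with a minimal sketch. The snippet below implements Cohen's κ from its definition (observed agreement corrected for chance agreement under each annotator's marginal label distribution) and applies it to toy sentiment labels; the label sets and values are illustrative, not the study's data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences beyond chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement: assumes independent labeling with each
    # annotator's own marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy three-class sentiment labels (hypothetical, not the study's corpus).
group_m = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "neg"]
group_f = ["pos", "neg", "neu", "neu", "neg", "pos", "pos", "neg"]
print(round(cohens_kappa(group_m, group_f), 4))  # prints 0.619
```

The same computation applied to the two models' predictions on a shared test set yields the model-to-model κ values the abstract reports (0.3838 to 0.6952); in practice a library implementation such as scikit-learn's `cohen_kappa_score` would typically be used.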

References

Bryman, A. and Cramer, D. Quantitative data analysis with SPSS for Windows: A guide for social scientists. Routledge, London, 1997.

Azevedo, G. d., Pettine, G., Feder, F., Portugal, G., Schocair Mendes, C. O., Castaneda Ribeiro, R., Mauro, R. C., Paschoal Júnior, F., and Guedes, G. Nat: Towards an emotional agent. In 2021 16th Iberian Conference on Information Systems and Technologies (CISTI). IEEE, Chaves, Portugal, pp. 1–4, 2021.

Blodgett, S. L., Barocas, S., Daumé III, H., and Wallach, H. Language (technology) is power: A critical survey of "bias" in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, pp. 5454–5476, 2020.

Davani, A., Díaz, M., and Prabhakaran, V. Dealing with disagreements: Looking beyond the majority vote in subjective annotations. Transactions of the Association for Computational Linguistics vol. 10, pp. 92–110, 2022.

Geva, M., Goldberg, Y., and Berant, J. Are we modeling the task or the annotator? an investigation of annotator bias in natural language understanding datasets. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, pp. 1161–1166, 2019.

Havens, V. and Hedges, M. Uncertainty and inclusivity in gender bias annotation. In Proceedings of the 1st Workshop on Gender Bias in Natural Language Processing. Association for Computational Linguistics, Abu Dhabi, UAE, pp. 25–31, 2022.

Kowsari, K., Meimandi, K. J., Heidarysafa, M., Mendu, S., Barnes, L., and Brown, D. Text classification algorithms: A survey. Information 10 (4): 150, 2019.

Landis, J. R. and Koch, G. G. The measurement of observer agreement for categorical data. Biometrics 33 (1): 159–174, 1977.

Lim, S. S., Udomcharoenchaikit, C., Limkonchotiwat, P., Chuangsuwanich, E., and Nutanong, S. Identifying and mitigating annotation bias in natural language understanding using causal mediation analysis. In Findings of the Association for Computational Linguistics: ACL 2024. Association for Computational Linguistics, Bangkok, Thailand, pp. 11548–11563, 2024.

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., and Galstyan, A. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR) 54 (6): 1–35, 2021.

Minatel, D., da Silva, A. C. M., dos Santos, N. R., Curi, M., Marcacini, R. M., and de Andrade Lopes, A. Data stratification analysis on the propagation of discriminatory effects in binary classification. In Anais do 11º Symposium on Knowledge Discovery, Mining and Learning (KDMiLe). Sociedade Brasileira de Computação, Belo Horizonte, MG, pp. 73–80, 2023.

Paullada, A., Raji, I. D., Bender, E. M., Denton, E., and Hanna, A. Data and its (dis)contents: A survey of dataset development and use in machine learning research. Patterns 2 (11): 100336, 2021.

Raji, I. D., Bender, E. M., Paullada, A., Denton, E., and Hanna, A. Ai and the everything in the whole wide world benchmark. In Proceedings of the NeurIPS 2021 Datasets and Benchmarks Track. NeurIPS, Virtual Conference, pp. 1–10, 2021.

Schwindt, L. C. Predizibilidade da marcação de gênero em substantivos no português brasileiro. Gênero e língua (gem): formas e usos vol. 1, pp. 279–294, 2020.

Shah, D. S., Schwartz, H. A., and Hovy, D. Predictive biases in natural language processing models: A conceptual framework and overview. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, pp. 5248–5264, 2020.

Silva, M. O. and Moro, M. M. NLP pipeline for gender bias detection in Portuguese literature. In Anais do Seminário Integrado de Software e Hardware (SEMISH). Sociedade Brasileira de Computação, Brasília, Brazil, pp. 1–10, 2024.

Stańczak, K. and Augenstein, I. A survey on gender bias in natural language processing. arXiv preprint, 2021.

Sun, T., Gaut, A., Tang, S., Huang, Y., Sap, M., Clark, E., Friedman, D., Choi, Y., Smith, N. A., Zettlemoyer, L., et al. Mitigating gender bias in natural language processing: Literature review. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, pp. 1630–1640, 2019.

Zhao, J., Wang, T., Yatskar, M., Ordonez, V., and Chang, K.-W. Men also like shopping: Reducing gender bias amplification using corpus-level constraints. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, Denmark, pp. 2979–2989, 2017.
Published
29/09/2025
FEITOSA, Alexander; CARNEIRO, Érica; GUEDES, Gustavo. Beyond Systematic Bias: Investigating Gender Differences in Portuguese Text Classification Annotation Patterns. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 13., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 129-136. ISSN 2763-8944. DOI: https://doi.org/10.5753/kdmile.2025.247589.