DOI: 10.1145/3539637.3556892

Should We Translate? Evaluating Toxicity in Online Comments when Translating from Portuguese to English

Published: 07 November 2022

ABSTRACT

Social media and online discussion platforms suffer from the prevalence of uncivil behavior, such as harassment and abuse, and seek to curb toxic comments. Several approaches exist for classifying toxic comments automatically. Some of them have more resources and are more advanced for English, which encourages translating text from other languages into English before classification. While researchers have shown evidence that this practice is suitable for certain tasks, such as sentiment analysis, little is known about it in the context of toxicity identification. In this research, we assess the performance of Perspective API, a freely available model for toxic language detection that some well-known news media sites have adopted to identify different toxicity classes in online comments. As a use case, we obtained comments in Portuguese from two Brazilian news media websites during a politically polarized situation. This dataset was then translated to English and compared to four baseline datasets: two composed of highly toxic comments, one in Portuguese and the other in English, and two composed of neutral comments, also one in Portuguese and the other in English, all of them in their original language, not translated. Finally, human-annotated comments from the news comments dataset were analyzed to assess the scores provided by the Perspective API for the original and the translated versions. Results indicate that keeping the texts in their original language is preferable, even when comparing across different languages. Nevertheless, for cases where a translated version is strictly necessary, we suggest ways of handling the situation that preserve as much information as possible from the original version.
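The comparison described in the abstract (scores for the original text versus scores for its translation, both checked against human annotations) can be illustrated with a minimal sketch. This is not the authors' procedure: the `ScoredComment` type, the 0.5 decision threshold, and the sample values are all assumptions made for illustration, and no call to Perspective API is made. The scores are simply taken as given numbers in [0, 1].

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ScoredComment:
    """One comment with hypothetical toxicity scores in [0, 1]."""
    score_original: float    # score assigned to the Portuguese text
    score_translated: float  # score assigned to the English translation
    human_toxic: bool        # label given by a human annotator


def accuracy(comments, use_translated, threshold=0.5):
    """Share of comments whose thresholded score matches the human label.

    The 0.5 threshold is an assumption, not a value from the paper.
    """
    hits = 0
    for c in comments:
        score = c.score_translated if use_translated else c.score_original
        hits += (score >= threshold) == c.human_toxic
    return hits / len(comments)


def mean_score_shift(comments):
    """Average score change caused by translation (translated minus original)."""
    return mean(c.score_translated - c.score_original for c in comments)


# Made-up illustrative values, not data from the paper.
sample = [
    ScoredComment(0.75, 0.25, True),
    ScoredComment(0.25, 0.25, False),
    ScoredComment(0.75, 0.75, True),
    ScoredComment(0.25, 0.75, False),
]

print(accuracy(sample, use_translated=False))
print(accuracy(sample, use_translated=True))
print(mean_score_shift(sample))
```

Note that a mean shift near zero can hide large per-comment changes that cancel out, which is why the sketch reports agreement with the human labels separately for each version.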


Published in

WebMedia '22: Proceedings of the Brazilian Symposium on Multimedia and the Web
November 2022
389 pages
ISBN: 9781450394093
DOI: 10.1145/3539637

Copyright © 2022 ACM

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 270 of 873 submissions, 31%