A new approach for measuring subjectivity in Brazilian news
Keywords: Bias, Machine Learning, Natural Language Processing, News, Subjectivity
With the advent of digital journalism, the democratization of information has become a reality: news articles are published as soon as events occur and are accessible from any device connected to the internet. It is widely perceived that some media outlets are more biased than others in the way they present the facts. However, measuring such biases automatically is still an open research challenge. Under the assumption that journalistic texts should use objective and impartial language, high levels of subjectivity in these texts may indicate bias. This paper proposes an initial analysis of the use of subjectivity lexicons to characterize subjectivity in seven popular media outlets in Brazil. To better understand the results, we carried out a correlation analysis between levels of subjectivity, readability, and news popularity metrics. The adopted methods, along with the findings of this research, may contribute to a better understanding of the linguistic characteristics of the news that readers consume daily in Brazil.
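To make the two ingredients of the analysis concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a lexicon-based subjectivity level can be approximated as the fraction of tokens found in a subjectivity lexicon, and that the correlation analysis uses a rank correlation (Spearman is used here as an assumption). The lexicon `TOY_LEXICON` and all function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon (assumption): a real study would load a curated
# Portuguese subjectivity lexicon instead of this handful of words.
TOY_LEXICON = {"absurdo", "lamentável", "excelente", "talvez", "certamente"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def subjectivity_score(text, lexicon=TOY_LEXICON):
    """Fraction of tokens that appear in the lexicon (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in lexicon for token in tokens) / len(tokens)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / (var_x * var_y) ** 0.5
```

In use, one would compute `subjectivity_score` for each article, pair those scores with a readability or popularity value for the same articles, and pass the two lists to `spearman`. A token-ratio score is only one possible lexicon-based measure; weighting lexicon entries or normalizing per sentence are equally plausible choices.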