An Analysis of Subjectivity in Brazilian News

  • D. F. Lima, Universidade Federal de Campina Grande
  • A. S. C. Melo, Universidade Federal de Campina Grande
  • L. B. Marinho, Universidade Federal de Campina Grande


With the advent of digital journalism, the democratization of information has become a reality: news articles are published as soon as the facts occur and are accessible from any device connected to the internet. It is commonly perceived that some newspapers are more biased than others in the way they present the facts. However, measuring such biases automatically is still an open research challenge. Under the premise that journalistic texts should use objective and unbiased language, news with high levels of subjectivity may indicate bias. In this paper, we propose using subjectivity lexicons to characterize subjectivity in five news portals that are popular in Brazil. To better understand the results, we performed a correlation analysis between the measured levels of subjectivity and metrics of readability and news popularity. We believe that our methods and findings contribute to a better understanding of the linguistic characteristics of the news we consume daily.

Keywords: Bias, Machine Learning, Natural Language Processing, News, Subjectivity


How to Cite
LIMA, D. F.; MELO, A. S. C.; MARINHO, L. B. An Analysis of Subjectivity in Brazilian News. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 2019, Fortaleza. Anais do VII Symposium on Knowledge Discovery, Mining and Learning. Porto Alegre: Sociedade Brasileira de Computação, nov. 2019. p. 81-88. DOI: