FinBERT-PT-BR: Sentiment Analysis of Portuguese-Language Financial Market Texts
Abstract
This paper contributes a sentiment analysis model for financial news in Portuguese, built on the BERT neural network architecture. The model was trained in two stages: language modeling on 1.4 million texts, followed by sentiment modeling on 500 labeled texts. It outperformed current state-of-the-art models on several metrics and can be used to build sentiment indices, design investment strategies, and analyze macroeconomic data. The study demonstrates the potential of natural language processing and transformers for quantitative finance.