When Modernity Enhances Tradition and Specialization: A BERT Ensemble for Temporal Financial Argument Detection
Abstract
Temporal references in financial texts, such as earnings conference calls (ECCs), are essential for interpreting corporate discourse and guiding informed investment decisions. However, accurately identifying these references remains a challenge due to domain-specific language and vague temporal cues. In this work, we investigate whether combining distinct BERT-based models can improve performance on temporal financial argument detection. We evaluate an ensemble approach built upon three complementary models: BERT, FinBERT, and ModernBERT. Experiments on the FinArg-2 ECC dataset show that, while ModernBERT performs best individually, the ensemble of all three models, combined through a soft voting strategy, sets a new state of the art. These results highlight the potential of BERT-based ensembles for more accurate and robust temporal reasoning in financial NLP tasks.
Keywords:
BERT, Ensembles, Finance, Temporal Argument Classification
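As an illustration of the soft voting strategy mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each fine-tuned model returns one logit vector per example, converts the logits to probabilities with a softmax, averages the probabilities across models with equal weights, and picks the class with the highest average. The function names, the three-class setup, and all numeric values are hypothetical and chosen only for the example.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert one logit vector into a probability distribution."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(model_probs: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Equal-weight soft voting.

    model_probs[m][i][c] is the probability that model m assigns to
    class c for example i. Returns one predicted class index per example.
    """
    n_models = len(model_probs)
    n_examples = len(model_probs[0])
    predictions = []
    for i in range(n_examples):
        n_classes = len(model_probs[0][i])
        # Average the per-class probabilities over all models.
        averaged = [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        # Predict the class with the highest averaged probability.
        predictions.append(max(range(n_classes), key=averaged.__getitem__))
    return predictions


# Hypothetical per-example class probabilities from three models
# (two examples, three classes); these are not results from the paper.
bert_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
finbert_probs = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]
modernbert_probs = [[0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]

ensemble_predictions = soft_vote([bert_probs, finbert_probs, modernbert_probs])
print(ensemble_predictions)
```

Unlike hard (majority) voting, soft voting lets a confident model outweigh two uncertain ones, which is one plausible reason an ensemble can outperform its strongest member.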
Published
29 September 2025
How to Cite
MARTINHO, Leonardo; DUTRA, Hugo; ASSIS, Gabriel; CARVALHO, Jonnathan; PAES, Aline. When Modernity Enhances Tradition and Specialization: A BERT Ensemble for Temporal Financial Argument Detection. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 13., 2025, Fortaleza/CE. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 17-24. ISSN 2763-8944. DOI: https://doi.org/10.5753/kdmile.2025.247741.
