Filo-Transformer: A Model Based on Phylogenetic Tree Alignment Graphs and Transformers for Rumor and Fake News Detection

  • Acauan C. Ribeiro UFAM / UFRR
  • Eduardo L. Feitosa UFAM
  • André Carvalho UFAM

Abstract

This paper presents Filo-Transformer, a novel approach that combines the deep semantics of Transformer models with the evolutionary analysis of phylogenetic Tree Alignment Graphs (TAGs) to detect rumors and fake news. The model uses embeddings (SBERT/GPT) to represent content and extracts real phylogenetic attributes from Twitter conversation cascades, such as cascade depth, branching factor, and the proportion of verified users. A Feature Tokenizer Transformer (FT-Transformer) integrates this information for classification. Experiments on the PHEME dataset show that Filo-Transformer outperforms purely semantic models on all main metrics, with learned fusion weights converging to 65% phylogenetic features and 35% semantic features, confirming the value of structural propagation patterns.
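As an illustration, the cascade attributes named in the abstract (cascade depth, branching factor, proportion of verified users) can be computed from a reply tree roughly as sketched below. The `Node` structure and function names are hypothetical, introduced here for clarity; they are not the paper's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One post in a conversation cascade (source tweet or reply)."""
    verified: bool
    children: List["Node"] = field(default_factory=list)


def cascade_depth(node: Node, depth: int = 0) -> int:
    """Length of the longest reply chain starting at the source post."""
    if not node.children:
        return depth
    return max(cascade_depth(child, depth + 1) for child in node.children)


def branching_factor(root: Node) -> float:
    """Average number of direct replies per post that received any reply."""
    internal, edges = 0, 0
    stack = [root]
    while stack:
        n = stack.pop()
        if n.children:
            internal += 1
            edges += len(n.children)
            stack.extend(n.children)
    return edges / internal if internal else 0.0


def verified_ratio(root: Node) -> float:
    """Fraction of posts in the cascade authored by verified accounts."""
    total, verified = 0, 0
    stack = [root]
    while stack:
        n = stack.pop()
        total += 1
        verified += int(n.verified)
        stack.extend(n.children)
    return verified / total


# A toy cascade: a verified source tweet, two replies, one nested reply.
source = Node(verified=True, children=[
    Node(verified=False, children=[Node(verified=False)]),
    Node(verified=False),
])
phylo_features = [cascade_depth(source),      # 2
                  branching_factor(source),   # 1.5
                  verified_ratio(source)]     # 0.25
```

In the full model, scalar features of this kind would be tokenized by the FT-Transformer and fused with the SBERT/GPT content embedding, with the relative weighting of the two branches learned during training.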

References

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., Hesse, C., Chen, M., Sigler, E., Litwin, M., Gray, S., Chess, B., Clark, J., Berner, C., McCandlish, S., Radford, A., Sutskever, I., and Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.

Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.

Gorishniy, Y., Rubachev, I., Khrulkov, V., and Babenko, A. (2021). Revisiting deep learning models for tabular data. arXiv preprint arXiv:2106.11959.

Jang, S., Geng, T., Li, J.-Y. Q., Xia, R., Huang, C.-T., and Tang, J. (2018). A computational approach for examining the roots and spreading patterns of fake news: Evolution tree analysis. Computers in Human Behavior, 84:103–113.

Lazer, D. M., Baum, M. A., Grinberg, N., Friedland, L., Joseph, K., Hobbs, W., and Mattsson, C. (2018). The science of fake news. Science, 359(6380):1094–1096.

Li, Y., Chu, Z., Jia, C., and Zu, B. (2024). SAMGAT: Structure-aware multilevel graph attention networks for automatic rumor detection. PeerJ Computer Science, 10:e2200.

Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

Peng, H., Cao, C., Shao, M., Liu, Y., Liu, X., and Deng, Z. (2024). Difference in rumor dissemination and debunking before and after the relaxation of COVID-19 prevention and control measures in China: Infodemiology study. JMIR Public Health and Surveillance, 10.

Pennington, J., Socher, R., and Manning, C. D. (2014). GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543. Association for Computational Linguistics.

Qiu, X., Zhang, H., and Wang, J. (2022). Dynamic analysis and optimal control of rumor spreading model with recurrence and individual behaviors in heterogeneous networks. Entropy, 24(4):497.

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., and Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8).

Reimers, N. and Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992.

Sharma, D. and Srivastava, A. (2024). Detecting rumors in social media using emotion based deep learning approach. PeerJ Computer Science, 10:e2202.

Shu, K., Sliva, A., Wang, S., Tang, J., and Liu, H. (2017). Fake news detection on social media: A data mining perspective. arXiv preprint arXiv:1708.01967.

Smith, S. A. et al. (2013). Tree alignment graphs: A formal framework for synthesizing rooted trees. Proceedings of the National Academy of Sciences, 110:E117–E125.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems, volume 30.

Vosoughi, S., Roy, D., and Aral, S. (2018). The spread of true and false news online. Science, 359(6380):1146–1151.

Wardle, C. (2017). Fake news. It's complicated. [link]. Accessed: 2025-05-12.

Wasserman, S. and Faust, K. (1997). [Book review] Social Network Analysis: Methods and Applications. American Ethnologist, 24(1):219–220.

Wu, S., Deng, Y., Liu, J., Luo, X., and Sun, G. (2025). Rumor detection on social networks based on temporal tree transformer. PLoS ONE, 20(4):e0320333.

Zubiaga, A., Liakata, M., Procter, R., Hoi, G. W. S., and Tolmie, P. (2016). Analysing how people orient to and spread rumours in social media. ACM Transactions on Intelligent Systems and Technology (TIST), 7(2):1–36.
Published
September 1, 2025
RIBEIRO, Acauan C.; FEITOSA, Eduardo L.; CARVALHO, André. Filo-Transformer: Um modelo baseado em Grafo de Alinhamento de Árvores Filogenéticas e Transformers para Identificação de Rumores e Fake News. In: SIMPÓSIO BRASILEIRO DE CIBERSEGURANÇA (SBSEG), 25., 2025, Foz do Iguaçu/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 432-447. DOI: https://doi.org/10.5753/sbseg.2025.10657.
