Classification of Freirean Dialogue in Discussion Forum Messages: A Performance Analysis of TF-IDF and Sentence-BERT

Francisco Romes da Silva Filho (UFC); Gabriel Antoine Louis Paillard (UFC); Rafael Augusto Ferreira do Carmo (UFC); Ernesto Trajano de Lima (UFC); Michel Sales Bonfim (UFC)

DOI: https://doi.org/10.5753/sbie.2023.235298

Abstract

In recent years, the concepts of Freirean dialogue have been applied and organized so that characteristics of forum messages can be described in terms of the theory. This work proposes a binary text classifier that detects the presence of Valuing of Autonomy (Valorização da Autonomia) in forum messages, and it compares the performance of two text encoding techniques. The results indicate, with statistical significance, that Sentence-BERT outperformed TF-IDF as an encoding method.
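To make the TF-IDF side of the comparison concrete, the following is a minimal sketch of how forum messages might be turned into TF-IDF vectors before being passed to a binary classifier. It is not the authors' pipeline: the whitespace tokenizer, the smoothed IDF variant, the L2 normalization, the function name `tfidf_encode`, and the toy messages are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_encode(messages):
    """Encode messages as L2-normalized TF-IDF vectors over a shared vocabulary.

    Assumptions (not from the paper): lowercase whitespace tokenization,
    raw term counts, and the smoothed IDF ln((1 + n) / (1 + df)) + 1.
    """
    tokenized = [m.lower().split() for m in messages]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)

    # Document frequency: number of messages that contain each term.
    df = Counter(tok for doc in tokenized for tok in set(doc))

    # Smoothed inverse document frequency; rarer terms get larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vec = [counts[t] * idf[t] for t in vocab]
        # L2-normalize so that message length does not dominate the encoding.
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


# Hypothetical toy messages, not taken from the paper's dataset.
messages = [
    "what do you think about this idea",
    "read chapter two before the next class",
    "what is your own view on chapter two",
]
vocab, vectors = tfidf_encode(messages)
```

Each message becomes a sparse vector with one dimension per vocabulary term, so two messages are similar only when they share words. Sentence-BERT instead produces dense sentence embeddings from a pretrained model, which is the contrast the abstract evaluates.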
