Summarization of Educational Videos with Transformers Networks

  • Leandro Massetti Ribeiro Oliveira (UFMA)
  • Li Chang Shuen (UFMA)
  • Allan Kássio Beckman Soares da Cruz (UFMA)
  • Carlos de Salles Soares (UFMA)


This paper presents an approach to summarizing educational videos using deep learning Transformer models. The approach targets educational content: it summarizes the video captions and then uses the resulting text summary to summarize the video itself. In tests on the EDUVSUM dataset, the approach improved upon the original paper’s results, achieving an accuracy of 26.53% on a multi-class problem, with a mean absolute error of 1.49 per video frame and 1.45 per video segment. Transformer techniques for automatic text summarization proved effective in creating multimedia learning objects. The results suggest that these techniques can produce high-quality digital educational resources more efficiently, reducing the time and effort required to create them.
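
To make the reported metrics concrete, the following is a minimal sketch of how multi-class accuracy and mean absolute error can be computed per frame and per segment. It is not the authors' evaluation code: the integer importance scores, the fixed `segment_length`, and all function names are assumptions chosen for illustration.

```python
from statistics import mean


def accuracy(pred, true):
    """Fraction of positions where the predicted class equals the annotated class."""
    if len(pred) != len(true) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == t for p, t in zip(pred, true)) / len(true)


def mean_absolute_error(pred, true):
    """Average absolute difference between predicted and annotated scores."""
    if len(pred) != len(true) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(p - t) for p, t in zip(pred, true))


def segment_scores(frame_scores, segment_length):
    """Average frame scores over consecutive segments of a fixed length (assumed)."""
    return [
        mean(frame_scores[i:i + segment_length])
        for i in range(0, len(frame_scores), segment_length)
    ]


# Hypothetical per-frame importance scores for one short video.
true_frames = [1, 1, 3, 3, 5, 5]
pred_frames = [1, 2, 3, 5, 5, 4]

frame_accuracy = accuracy(pred_frames, true_frames)
frame_mae = mean_absolute_error(pred_frames, true_frames)
segment_mae = mean_absolute_error(
    segment_scores(pred_frames, 2),
    segment_scores(true_frames, 2),
)
```

In this sketch, averaging scores within a segment before comparing lets opposite-sign frame errors partially cancel, which is one way a per-segment error can come out lower than the per-frame error.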

Keywords: Machine learning, transformers, e-learning, video summarization


OLIVEIRA, Leandro Massetti Ribeiro; SHUEN, Li Chang; DA CRUZ, Allan Kássio Beckman Soares; SOARES, Carlos de Salles. Summarization of Educational Videos with Transformers Networks. In: SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 29., 2023, Ribeirão Preto/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 137–143.

