Frame-Based Semantic Representation and Similarity Analysis in Audio Description Scripts
Abstract
We present a semantic similarity analysis of two versions of the audio description script for a Brazilian short film, grounded in Frame Semantics and the FrameNet Brasil annotation model. Both versions were manually annotated for lexical units, the frames they evoke, and the associated frame elements. A similarity metric combining spreading activation and cosine similarity measures the semantic overlap between the annotated scripts, and the results show that differences in frame evocation reflect different narrative and descriptive choices. The work demonstrates how frame-based models can capture semantic overlap and divergence in multimodal translation tasks such as audio description.
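To illustrate how a metric of this kind can work, the following is a minimal sketch that spreads activation over a small graph of frame relations and then compares the two resulting activation vectors with cosine similarity. It is not the implementation used in the study: the relation graph, the frame sets assigned to each script, the `decay` and `steps` parameters, and all function names are assumptions made for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical frame-to-frame relations (treated as undirected links).
# These links are illustrative and are not taken from FrameNet Brasil.
FRAME_RELATIONS = {
    "Self_motion": ["Motion"],
    "Motion": ["Self_motion", "Arriving"],
    "Arriving": ["Motion"],
    "Perception_active": ["Perception"],
    "Perception": ["Perception_active"],
}


def spread_activation(evoked, relations, decay=0.5, steps=2):
    """Propagate activation from evoked frames to related frames.

    `evoked` maps frame names to initial activation. At each step the
    activation that arrived in the previous step is passed on to
    neighbouring frames, scaled by `decay` (an assumed parameter).
    """
    activation = defaultdict(float)
    frontier = dict(evoked)
    for frame, weight in frontier.items():
        activation[frame] += weight
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for frame, weight in frontier.items():
            for neighbour in relations.get(frame, []):
                next_frontier[neighbour] += weight * decay
        for frame, weight in next_frontier.items():
            activation[frame] += weight
        frontier = next_frontier
    return dict(activation)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical frames evoked by the same scene in two script versions.
script_a = {"Self_motion": 1.0, "Perception_active": 1.0}
script_b = {"Arriving": 1.0, "Perception": 1.0}

raw = cosine_similarity(script_a, script_b)
spread = cosine_similarity(
    spread_activation(script_a, FRAME_RELATIONS),
    spread_activation(script_b, FRAME_RELATIONS),
)
print(f"raw={raw:.3f} spread={spread:.3f}")
```

In this toy case the two scripts share no frame, so plain cosine similarity over evoked frames is zero, whereas spreading activation first lets related frames contribute and yields a non-zero score. This is the motivation for combining the two steps.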
References
Bateman, J., Wildfeuer, J., and Hiippala, T. (2017) “Multimodality: Foundations, Research and Analysis – A Problem-Oriented Introduction”. Walter de Gruyter GmbH & Co KG. DOI: 10.1515/9783110479898.
Das, D., Chen, D., Martins, A. F. T., Schneider, N., and Smith, N. A. (2014) “Frame-semantic parsing”. Computational Linguistics, vol. 40, no. 1, pp. 9–56.
Dornelas, L. D., Gamonal, M. A., and Pagano, A. S. (2024) “Análise semântica de audiodescrição em curta metragem: uma abordagem multimodal a partir da Semântica de Frames”. Domínios de Linguagem, Uberlândia, v. 1866, p. 2-30. DOI: 10.34019/1808-9461.2022.v23.38564.
Dutra, L. V. (2024) “Evaluating the contribution of FrameNet to gender-based violence identification: How semantic annotation can be used as a resource for identifying patterns of violence”. Master’s Thesis (Master in Language Technology), Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg, Göteborg. Available at: [link].
EU NÃO QUERO VOLTAR SOZINHO. Short film. Directed by Daniel Ribeiro. Produced by Lacuna Filmes. Brazil: [s. n.], 2010. Available at: [link].
Fillmore, C. J. (1982) “Frame semantics”. In: Linguistic Society of Korea (Ed.), Linguistics in the Morning Calm, pp. 111–137. Hanshin Publishing Co.
Gouws, S., van Rooyen, G.-J., and Engelbrecht, H. A. (2010) “Measuring conceptual similarity by spreading activation over Wikipedia’s hyperlink structure”. In: Proceedings of the 2nd Workshop on The People’s Web Meets NLP: Collaboratively Constructed Semantic Resources, pages 46–54, Beijing, China, August. Coling 2010 Organizing Committee.
Lenci, A., Sahlgren, M., Jeuniaux, P. et al. (2022) “A comparative evaluation and analysis of three generations of Distributional Semantic Models”. Language Resources and Evaluation, vol. 56, pp. 1269–1313. DOI: 10.1007/s10579-021-09575-z.
Pagano, A. S., Teixeira, A. L. R., and Mayer, F. A. (2021) “Accessible Audiovisual Translation”. In: Meng Ji and Sara Laviosa (eds), The Oxford Handbook of Translation and Social Practices, England. DOI: 10.1093/oxfordhb/9780190067205.013.4.
Saedi, C., Branco, A., Rodrigues, J. R., and Silva, J. (2018) “WordNet Embeddings”. In: Proceedings of the Third Workshop on Representation Learning for NLP, pages 122–131, Melbourne, Australia. Association for Computational Linguistics.
Samagaio, M., Torrent, T. T., Matos, E., and Lorenzi, A. (2024) “Semantic Permanence in Audiovisual Translation: a FrameNet approach to subtitling”. In: Proceedings of the 16th International Conference on Computational Processing of Portuguese, Vol. 1, pp. 168–176, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics.
Souza, M. M. S., Gamonal, M. A., and Pagano, A. S. (2025) “Permanência semântica entre áudio original e legenda: um estudo sobre anotação semântica multimodal em obra audiovisual”. Caligrama: Revista de Estudos Românicos, 30(1), 52–73. DOI: 10.35699/2317-2096.2025.57566.
Torrent, T. T., Matos, E., Belcavello, F., Viridiano, M., Gamonal, M. A., Diniz, A., and Coutinho, M. M. (2022) “Representing context in FrameNet: A multidimensional, multimodal approach”. In: Frontiers in Psychology, v. 13. DOI: 10.3389/fpsyg.2022.838441.
Turton, J., Smith, R. E., and Vinson, D. (2021) “Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings”. In: Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021), pages 248–262, Online. Association for Computational Linguistics.
Viridiano, M., Torrent, T. T., Czulo, O., Lorenzi, A., Matos, E., and Belcavello, F. (2022) “The case for perspective in multimodal datasets”. In: Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022, pp. 108–116, Marseille, France. European Language Resources Association.
