Building a Frame-Semantic Model of the Healthcare Domain: Towards the identification of gender-based violence in public health data

  • Lívia Dutra UFJF / Göteborgs Universitet
  • Arthur Lorenzi UFJF
  • Lorena Larré UFJF
  • Frederico Belcavello UFJF
  • Ely Matos UFJF
  • Amanda Pestana UFJF
  • Kenneth Brown UFJF
  • Mariana Gonçalves UFJF
  • Victor Herbst UFJF
  • Sofia Reinach Vital Strategies Brasil
  • Renato Teixeira Vital Strategies Brasil
  • Pedro de Paula Vital Strategies Brasil
  • Alessandra Pellini USCS
  • Cibele Sequeira Secretaria da Saúde do Município de São Caetano do Sul
  • Ester Sabino USCS
  • Fábio Leal USCS
  • Mônica Conde USCS
  • Regina Grespan Secretaria da Saúde do Município de São Caetano do Sul
  • Tiago Torrent UFJF / CNPq


Public data systems gather a wide range of information about Brazilian citizens. This information enters the system both through the selection of parameterized options and through open text fields. In this paper, we describe the effort of modeling semantic frames for the lexicon of the healthcare domain as a means of tagging the open text fields in public health data, so as to make them more easily interpretable by machine learning systems. This effort is one step in a larger project that aims to use data science and machine learning techniques to identify territories prone to gender-based violence. The modeling effort currently covers 1,787 lexical units in the healthcare domain in Brazilian Portuguese, distributed across 29 semantic frames.

Keywords: Semantics and Pragmatics, Corpus Annotation, Lexicography, Lexicology, Terminology


How to Cite

DUTRA, Lívia et al. Building a Frame-Semantic Model of the Healthcare Domain: Towards the identification of gender-based violence in public health data. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 338-346. DOI: