Identification of the Radical Component from Images of Chinese Characters

  • Yu Tzu Wu UNICAMP
  • Eric Fujiwara UNICAMP
  • Carlos K. Suzuki UNICAMP

Abstract


The Chinese writing system consists of pictographic symbols known as hanzi, or Han characters, each composed of one or more smaller units of varying complexity. Every hanzi has a fundamental block, the radical, which defines its meaning and usage and determines how words are organized in the dictionary. Given its importance, this work proposes and demonstrates an automatic identifier and localizer of hanzi radicals from images, reaching an average precision of ~78.0% and an average recall of ~74.3% for the 30 radicals with the most entries in the dictionary and on a random excerpt of a historical novel.
Keywords: Training, Location awareness, Dictionaries, Text categorization, Symbols, Memory, Information retrieval
Published
24/10/2022
WU, Yu Tzu; FUJIWARA, Eric; SUZUKI, Carlos K. Identification of the Radical Component from Images of Chinese Characters. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 35., 2022, Natal/RN. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.