JCLexT: A Java Tool for Compiling Finite-State Transducers from Full-Form Lexicons

  • Leonel F. de Alencar UFC
  • Philipp B. Costa UFC
  • Mardonio J. C. França UFC
  • Alexander Ewart UFC
  • Katiuscia M. Andrade UFC
  • Rossana M. C. Andrade UFC

Abstract


JCLexT is a compiler of finite-state transducers from full-form lexicons. It appears to be the first Java implementation of this functionality. JCLexT was compared with Foma on extensive Portuguese data. The main disadvantage of JCLexT is its slower compilation time relative to Foma. This is offset, however, by the size of the result: a large transducer compiled with JCLexT was 8.6% smaller than its Foma-created counterpart.

Published
2015-11-04
ALENCAR, Leonel F. de; COSTA, Philipp B.; FRANÇA, Mardonio J. C.; EWART, Alexander; ANDRADE, Katiuscia M.; ANDRADE, Rossana M. C. JCLexT: A Java Tool for Compiling Finite-State Transducers from Full-Form Lexicons. In: BRAZILIAN SYMPOSIUM IN INFORMATION AND HUMAN LANGUAGE TECHNOLOGY (STIL), 1., 2015, Natal/RN. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2015. p. 15-19.