Assessing the Impact of Stemming Algorithms Applied to Brazilian Legislative Documents Retrieval

  • Ellen Souza UFRPE / USP
  • Gyovana Moriyama USP
  • Douglas Vitório UFRPE / UFPE
  • André C. P. L. F. de Carvalho USP
  • Nádia Félix USP / UFG
  • Hidelberg O. Albuquerque UFRPE / UFPE
  • Adriano L. I. Oliveira UFPE


The main purpose of stemming is to reduce inflected words to their root form, or stem. Words that share a stem can then be mapped to the same concept, which improves information retrieval both by making document indexing more effective and by reducing data dimensionality. However, the performance of stemming algorithms varies with several factors, and previous studies in the area have reached contrasting conclusions. This work assesses the use of stemmers in the retrieval of legislative documents written in Portuguese. Four stemmers were evaluated together with BM25 on two legislative corpora from the Brazilian Chamber of Deputies. The RSLP-S and Savoy stemmers yielded the largest improvements in the information retrieval pipeline.
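To make the pipeline concrete, the following is a minimal sketch of how a stemmer can sit in front of BM25 ranking. It is not the implementation evaluated in this work: `toy_plural_stem` is a hypothetical stand-in that only strips a trailing "s" (far cruder than RSLP-S or Savoy), the three documents and the query are invented for illustration, and the BM25 parameters `k1=1.5` and `b=0.75` are common defaults assumed here rather than values taken from the paper.

```python
import math
from collections import Counter


def toy_plural_stem(token):
    # Hypothetical simplification, NOT RSLP-S: drop one trailing "s"
    # from words longer than three characters.
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def tokenize(text, stem=None):
    # Lowercase, split on whitespace, and optionally stem each token.
    tokens = text.lower().split()
    return [stem(t) for t in tokens] if stem else tokens


def bm25_scores(query, docs, stem=None, k1=1.5, b=0.75):
    # Okapi BM25 with the smoothed, non-negative IDF variant.
    tokenized = [tokenize(d, stem) for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in tokenized for t in set(d))
    query_terms = tokenize(query, stem)

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


# Invented toy corpus and query, for illustration only.
docs = [
    "projetos de lei sobre educação",
    "projeto de lei sobre saúde",
    "relatório anual da comissão",
]
query = "projeto educação"

plain = bm25_scores(query, docs)                          # no stemming
stemmed = bm25_scores(query, docs, stem=toy_plural_stem)  # with toy stemmer
```

In this toy setting, the first document matches the query term "projeto" only after "projetos" is reduced to the same stem, so its score increases when the stemmer is applied. Whether such gains carry over to real corpora is the empirical question this work addresses.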


