Comparing Alternative Storage Models for Words Extracted from Legal Texts

  • Ana Paula Sodré, Universidade Federal do Paraná (UFPR)
  • Luis Eduardo Mochenski Floriano, Universidade Federal do Paraná (UFPR)
  • Dimmy Magalhães, Universidade Federal do Paraná (UFPR)
  • Cristina D. Aguiar, Universidade de São Paulo (USP)
  • Aurora Pozo, Universidade Federal do Paraná (UFPR)
  • Carmem S. Hara, Universidade Federal do Paraná (UFPR)

Abstract

The COVID-19 pandemic created new demands for services in the judicial system, motivating the use of a data warehouse (DW). Although some approaches already apply DWs in the judicial domain, few target the pandemic or make the information extracted from the texts publicly available. Guided by the needs of a legal expert, we have developed the COVID-19 Portal, which extracts documents from the Brazilian Supreme Federal Court to obtain quantitative information on the words used in the texts. In this paper, we present the design of a DW and show the query performance improvement achieved with its implementation. The DW was developed on Postgres, and its performance is compared with that of the original implementation on MongoDB Cloud and of a local MongoDB database.
Keywords: data warehouse, COVID-19, legal text
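
To make the idea of a dimensional model for word counts more concrete, the following is a minimal sketch, not the schema used in the paper. The table names (dim_document, dim_word, fact_word_count), the sample documents, and the tokenization rule are all hypothetical, and an in-memory SQLite database stands in for Postgres so that the example runs without a server.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical documents; the IDs and texts are invented for illustration.
DOCUMENTS = {
    "doc-1": "The court granted the request. The request cited the pandemic.",
    "doc-2": "The pandemic delayed the hearing.",
}


def build_warehouse(documents):
    """Load per-document word counts into a small star schema (in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_document (doc_key INTEGER PRIMARY KEY, doc_id TEXT UNIQUE);
        CREATE TABLE dim_word (word_key INTEGER PRIMARY KEY, word TEXT UNIQUE);
        CREATE TABLE fact_word_count (
            doc_key INTEGER REFERENCES dim_document(doc_key),
            word_key INTEGER REFERENCES dim_word(word_key),
            occurrences INTEGER NOT NULL,
            PRIMARY KEY (doc_key, word_key)
        );
        """
    )
    for doc_id, text in documents.items():
        doc_key = conn.execute(
            "INSERT INTO dim_document (doc_id) VALUES (?)", (doc_id,)
        ).lastrowid
        # Assumed tokenization: lowercase runs of ASCII letters.
        counts = Counter(re.findall(r"[a-z]+", text.lower()))
        for word, n in counts.items():
            # Each distinct word is stored once in the word dimension.
            conn.execute("INSERT OR IGNORE INTO dim_word (word) VALUES (?)", (word,))
            (word_key,) = conn.execute(
                "SELECT word_key FROM dim_word WHERE word = ?", (word,)
            ).fetchone()
            conn.execute(
                "INSERT INTO fact_word_count VALUES (?, ?, ?)", (doc_key, word_key, n)
            )
    conn.commit()
    return conn


def total_occurrences(conn, word):
    """Return the total number of occurrences of a word across all documents."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(f.occurrences), 0)
        FROM fact_word_count f
        JOIN dim_word w ON w.word_key = f.word_key
        WHERE w.word = ?
        """,
        (word,),
    ).fetchone()
    return row[0]
```

Calling build_warehouse(DOCUMENTS) and then total_occurrences(conn, "pandemic") aggregates over the fact table rather than rescanning the texts, which is the kind of quantitative query a DW of this sort is meant to serve.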

Published
04/10/2021
SODRÉ, Ana Paula; FLORIANO, Luis Eduardo Mochenski; MAGALHÃES, Dimmy; AGUIAR, Cristina D.; POZO, Aurora; HARA, Carmem S. Comparing Alternative Storage Models for Words Extracted from Legal Texts. In: WORKSHOP DE TRABALHOS DE ALUNOS DA GRADUAÇÃO (WTAG) - SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 36., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 36-42. DOI: https://doi.org/10.5753/sbbd_estendido.2021.18160.