HKPoly: A Polystore Architecture to Support Data Linkage and Queries on Distributed and Heterogeneous Data

  • Leonardo Guerreiro Azevedo IBM Research Brazil
  • Renan Souza Oak Ridge National Laboratory
  • Elton Soares IBM Research Brazil
  • Raphael Melo Thiago IBM Research Brazil
  • Julio Cesar Cardoso Tesolin IME
  • Anna Carolina Carvalho Moreira Oliveira UFRJ
  • Marcio Ferreira Moreno MOBR Systems

Abstract


Context: Modern information systems commonly manipulate heterogeneous data and schemas fragmented across the data stores that best fit their storage and access requirements. Moreover, business processes of different organizations consume these fragments independently, without explicit links between the data they use. Problem: Supporting heterogeneous data that is not explicitly connected and resides in distinct data repositories is a major challenge. Solution: This work proposes HKPoly, a federated architecture that encapsulates data heterogeneity, location, and linkage. IS Theory: We employed Representation Theory to create the models of the architecture and its components. Method: We implemented the architecture, applied it in an Oil & Gas scenario, and compared it to a multi-database system. Results: With the proposal, queries are half as complex to write as those for the relational multi-database system, at the cost of about 30% additional query processing time. Contributions: An architecture to query heterogeneous data, the requirements and components for its implementation, and an example implementation using state-of-the-art concepts.

Keywords: Business process, Database integration, Distributed databases, Microservices, Provenance, Query processing
Published
20/05/2024
AZEVEDO, Leonardo Guerreiro; SOUZA, Renan; SOARES, Elton; THIAGO, Raphael Melo; TESOLIN, Julio Cesar Cardoso; OLIVEIRA, Anna Carolina Carvalho Moreira; MORENO, Marcio Ferreira. HKPoly: A Polystore Architecture to Support Data Linkage and Queries on Distributed and Heterogeneous Data. In: SIMPÓSIO BRASILEIRO DE SISTEMAS DE INFORMAÇÃO (SBSI), 20., 2024, Juiz de Fora/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024.