Integrating External Autonomous Bases in Queries Processing on Distributed RDF Bases

  • Hugo Paulino Bonfim Takiuchi, Federal University of Paraná
  • Raqueline Ritter de Moura Penteado, State University of Maringá
  • Carmem Satie Hara, Federal University of Paraná

Abstract


In RDF, a query can involve both data stored in third-party autonomous databases, accessed through SPARQL endpoints, and data stored in a proprietary base. Federated systems process this type of query by treating both the external and the proprietary databases as black boxes: a moderator sends sub-queries to the databases and combines their results. A traditional RDF system, on the other hand, does not support access to external bases, but it treats its proprietary base as a white box, which allows it to optimize its internal processing strategies. This article proposes an alternative to the federated architecture, called FeSHyD, which exploits a distributed proprietary base and allows its servers to communicate with third-party databases during query processing. The proposal promotes parallel processing of queries on the proprietary base and decentralizes the tasks that the moderator performs in a federated system. Initial experiments show that FeSHyD can reduce query response time compared to federated systems.
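The moderator role in a federated system, as described in the abstract, can be illustrated with a minimal Python sketch. It is not FeSHyD's or any cited system's implementation. The names `join_bindings`, `moderator`, `external_endpoint`, and `proprietary_base` are hypothetical, the two sources are stubbed with in-memory data instead of real SPARQL endpoints, and each source is assumed to return rows that all bind the same set of variables.

```python
from typing import Callable, Dict, List, Tuple

Binding = Dict[str, str]                  # one solution: variable name -> RDF term
Source = Callable[[str], List[Binding]]   # hypothetical: sub-query text -> bindings


def join_bindings(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Hash-join two binding lists on the variables they share."""
    if not left or not right:
        return []
    # Assumption: every row in a list binds the same variables.
    shared = sorted(set(left[0]) & set(right[0]))
    index: Dict[tuple, List[Binding]] = {}
    for row in right:
        index.setdefault(tuple(row[v] for v in shared), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[v] for v in shared), []):
            joined.append({**row, **match})
    return joined


def moderator(subqueries: List[Tuple[Source, str]]) -> List[Binding]:
    """Send each sub-query to its source, then join all partial results."""
    results = [source(text) for source, text in subqueries]
    combined = results[0]
    for partial in results[1:]:
        combined = join_bindings(combined, partial)
    return combined


# Stub sources standing in for an external endpoint and a proprietary base.
def external_endpoint(_query: str) -> List[Binding]:
    return [{"drug": "ex:aspirin", "label": "Aspirin"}]


def proprietary_base(_query: str) -> List[Binding]:
    return [
        {"drug": "ex:aspirin", "price": "4.50"},
        {"drug": "ex:ibuprofen", "price": "6.00"},
    ]


answers = moderator([
    (external_endpoint, "SELECT ?drug ?label WHERE { ?drug rdfs:label ?label }"),
    (proprietary_base, "SELECT ?drug ?price WHERE { ?drug ex:price ?price }"),
])
print(answers)
```

In this sketch every partial result flows through a single moderator, which performs all the joins. That centralization is the task the abstract says the proposal distributes among the servers of the proprietary base.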

Keywords: federated search, SPARQL query, distributed hybrid databases, distributed system integration, subquery ordering

Published
2020-09-28
TAKIUCHI, Hugo Paulino Bonfim; PENTEADO, Raqueline Ritter de Moura; HARA, Carmem Satie. Integrating External Autonomous Bases in Queries Processing on Distributed RDF Bases. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 35., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 37-48. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2020.13623.