Fast and Scalable Relational Division on Database Systems
Abstract
Relational Algebra comprises several operators that support querying and manipulating data in relational databases. The Relational Division operator, in particular, gives a simple representation of queries that involve the concept of “for all”. SQL, however, has no explicit implementation of it. In this paper, we compare the performance of the best known SQL implementation of the division operator across different use cases. We also present a new algorithm for division, which we implemented through stored procedures. We performed a case study that uses relational division to select genetic data. The results show that our implementation of relational division is potentially faster than the best SQL implementation.
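To make the “for all” semantics concrete, the following is a minimal in-memory sketch of relational division in Python. It is not the algorithm presented in the paper, nor the SQL implementation it is compared against. The function name `divide` and the sample/marker data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def divide(dividend, divisor):
    """Relational division on binary tuples.

    dividend: iterable of (a, b) pairs.
    divisor:  iterable of b values.
    Returns the set of every a that is paired with ALL b values in divisor.
    """
    required = set(divisor)

    # Dividing by an empty relation keeps every a in the dividend,
    # because "for all b in {}" is vacuously true.
    if not required:
        return {a for a, _ in dividend}

    # Group the matching b values by a, ignoring b values that are not required.
    seen = defaultdict(set)
    for a, b in dividend:
        if b in required:
            seen[a].add(b)

    # Keep only the a values that cover the whole divisor.
    return {a for a, bs in seen.items() if bs == required}


# Hypothetical data: which samples carry which genetic markers.
genotypes = {
    ("s1", "rs1"), ("s1", "rs2"),
    ("s2", "rs1"),
    ("s3", "rs1"), ("s3", "rs2"), ("s3", "rs3"),
}
required_markers = {"rs1", "rs2"}

# Samples that carry all required markers.
result = divide(genotypes, required_markers)
```

Here `result` contains `s1` and `s3`: both carry every marker in `required_markers`, while `s2` lacks `rs2`. In SQL the same query is usually written with a doubly nested `NOT EXISTS` or with `GROUP BY` and a `HAVING COUNT` comparison, since there is no built-in division operator.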