A Key-Value Repository with Location Control
Abstract
The ever-increasing volume of data produced today poses challenges for storing and processing it. Traditional database solutions do not meet these challenges efficiently, especially with respect to scalability. One approach to providing scalability is a layered architecture that combines a distributed storage system with a simple access interface. This paper presents ALOCS, a distributed repository that adopts the key-value model and lets the user application control the allocation of data to servers. The goal is to allow the application to co-allocate data that are frequently used together. Our experimental study shows that ALOCS improves query response times by reducing the number of remote data accesses.
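The idea of application-controlled placement can be illustrated with a minimal sketch. It is a toy in-memory model, not the ALOCS interface: the class `ToyPlacementStore`, the helper `remote_reads`, the server names, and the keys are all hypothetical and chosen only for illustration. It shows how placing keys that are read together on the same server lowers the number of remote reads for a client running on that server.

```python
class ToyPlacementStore:
    """Toy in-memory model of a key-value store with caller-chosen placement.

    Hypothetical illustration only; this is not the ALOCS interface.
    """

    def __init__(self, servers):
        # One dictionary per server stands in for that server's local storage.
        self.data = {name: {} for name in servers}
        # Maps each key to the server that currently holds it.
        self.location = {}

    def put(self, key, value, server):
        """Store the pair on the server chosen by the application."""
        if server not in self.data:
            raise KeyError(f"unknown server: {server}")
        old = self.location.get(key)
        if old is not None and old != server:
            del self.data[old][key]  # moving a key: drop the stale copy
        self.data[server][key] = value
        self.location[key] = server

    def get_many(self, keys, client_server):
        """Read keys on behalf of a client running on client_server.

        Returns the values and the number of reads served by other servers.
        """
        values, remote = {}, 0
        for key in keys:
            server = self.location[key]
            values[key] = self.data[server][key]
            if server != client_server:
                remote += 1
        return values, remote


def remote_reads(placement, keys, client_server, servers=("s1", "s2", "s3")):
    """Load keys under a key -> server placement and count remote reads."""
    store = ToyPlacementStore(servers)
    for key, server in placement.items():
        store.put(key, f"value-of-{key}", server)
    _, remote = store.get_many(keys, client_server)
    return remote


# Three records that one query reads together (made-up keys).
keys = ["order:1", "order:1:item:1", "order:1:item:2"]
scattered = {"order:1": "s1", "order:1:item:1": "s2", "order:1:item:2": "s3"}
co_located = {key: "s1" for key in keys}

print(remote_reads(scattered, keys, "s1"))   # 2
print(remote_reads(co_located, keys, "s1"))  # 0
```

In this toy model the placement decision is an explicit argument of `put`, so the application, rather than a hash function, decides which records share a server.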
Keywords:
ALOCS, Key-Value Storage, Data Locality
Published
2016-10-04
How to Cite
BUNGAMA, Patrick A.; DE OLIVEIRA, Wendel M.; SOUSA, Flávio R. C.; HARA, Carmem S. A Key-Value Repository with Location Control. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 31., 2016, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016. p. 88-99. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2016.24311.
