A Key-Value Store with Locality Control

Patrick A. Bungama; Wendel M. de Oliveira; Flávio R. C. Sousa; Carmem S. Hara

doi:10.5753/sbbd.2016.24311

Patrick A. Bungama, Universidade Federal do Paraná
Wendel M. de Oliveira, Universidade Federal do Paraná
Flávio R. C. Sousa, Universidade Federal do Ceará
Carmem S. Hara, Universidade Federal do Paraná

Abstract

The growth in the volume of data being produced poses challenges for storing and processing that data. Traditional database solutions have not proven efficient in the face of these challenges, particularly with respect to scalability. One approach to providing scalability is a layered architecture, which combines a distributed storage system with a simple interface for data access. This paper presents ALOCS, a distributed data store that adopts the key-value model and lets the client application control data locality, reducing communication during query processing. Experimental studies show improved query response times with the proposed solution.

Keywords: ALOCS, key-value storage, data locality
