Racks Reliability and Data Availability as Metrics for Replica Balancing of an HDFS cluster

Abstract


Data replication is essential for reliability in HDFS, but it can intensify cluster imbalance. The HDFS Balancer is an Apache Hadoop daemon designed to perform replica balancing. The balancer, however, is not optimized to meet fault tolerance and data availability demands during data redistribution. This work presents a customization of the HDFS Balancer that evaluates the system’s racks based on the failure rate of their DataNodes (DNs) to determine which nodes should receive more or less data. In addition, we use a priority policy that customizes the selection and redistribution of blocks during balancing, aiming to increase the final availability of the data in the cluster.
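
The idea of ranking racks by the failure rate of their DataNodes can be illustrated with a minimal sketch. The abstract does not say how rack reliability is computed, so everything here is an assumption made for illustration: the exponential failure model, the use of a simple mean to aggregate node reliability per rack, the 24-hour window, and the names `node_reliability`, `rack_reliability`, and `rank_racks`. None of this is the paper's actual method or part of the HDFS Balancer API.

```python
import math


def node_reliability(failure_rate, hours):
    """Probability that a node survives `hours`.

    Assumes an exponential failure model (an illustrative choice,
    not taken from the paper).
    """
    return math.exp(-failure_rate * hours)


def rack_reliability(failure_rates, hours=24.0):
    """Mean reliability of the DataNodes in one rack (hypothetical aggregate)."""
    values = [node_reliability(rate, hours) for rate in failure_rates]
    return sum(values) / len(values)


def rank_racks(racks, hours=24.0):
    """Return rack names ordered from most to least reliable."""
    scores = {name: rack_reliability(rates, hours) for name, rates in racks.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-DataNode failure rates (failures per hour) for three racks.
racks = {
    "rack-a": [0.001, 0.002],
    "rack-b": [0.010, 0.020],
    "rack-c": [0.0005, 0.0005],
}

print(rank_racks(racks))
```

Under these assumptions, a balancer could prefer racks near the front of the ranking as destinations for additional replicas and treat racks near the end as candidates for holding less data.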

Keywords: big data, replica balancing, fault tolerance and resilience, reliability, availability

Published
2020-12-07
FAZUL, Rhauani Weber Aita; BARCELOS, Patrícia Pitthan. Racks Reliability and Data Availability as Metrics for Replica Balancing of an HDFS cluster. In: BRAZILIAN SYMPOSIUM ON COMPUTER NETWORKS AND DISTRIBUTED SYSTEMS (SBRC), 38., 2020, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 1-14. ISSN 2177-9384. DOI: https://doi.org/10.5753/sbrc.2020.12269.