Experimentation and Analysis of Dynamic Checkpoint on Apache Hadoop with Failure Scenarios

  • Paulo Vinicius Cardoso UFSM
  • Patrícia Pitthan Barcelos UFSM

Abstract

The growth of reliability problems in high-performance systems has motivated the search for fault tolerance mechanisms. The Apache Hadoop framework, created to store and process large amounts of data, implements Checkpoint and Recovery to support the recovery of its distributed file system (Hadoop Distributed File System, HDFS) in the presence of failures. However, because its configuration attributes cannot be changed at runtime, poor choices may cause performance and reliability problems. This work uses a dynamic configuration mechanism for checkpointing in Hadoop and evaluates its performance in scenarios with faults induced on the master element of HDFS.
Published
2018-10-01
How to cite
CARDOSO, Paulo Vinicius; BARCELOS, Patrícia Pitthan. Experimentation and Analysis of Dynamic Checkpoint on Apache Hadoop with Failure Scenarios. Anais do Simpósio em Sistemas Computacionais de Alto Desempenho (SSCAD), [S.l.], p. 170-176, Oct. 2018. ISSN 0000-0000. Available at: <https://sol.sbc.org.br/index.php/sscad/article/view/15658>. Accessed: 17 May 2024.