A distributed checkpoint mechanism for replicated state machines


Çelikel N. Ö., OVATMAN T.

CLOSER 2020 - Proceedings of the 10th International Conference on Cloud Computing and Services Science, Prague, Czech Republic, 7-9 May 2020, pp. 515-520

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.5220/0009797405150520
  • City: Prague
  • Country: Czech Republic
  • Page Numbers: pp.515-520
  • Keywords: Replicated State Machines, Distributed Checkpoints, Fault Tolerance

Abstract

This study presents preliminary results from a distributed checkpointing approach developed for use with replicated state machines. Our approach takes advantage of splitting the partial execution history of the master state machine and storing it in a distributed way. Our initial results show that this approach reduces memory consumption for both the running replicas and the restoring replica in case of a failure. On the other hand, as a major drawback, it increases the restore duration for larger histories and larger numbers of replicas.
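
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Replica` class, the round-robin segment placement, the `segment_size` parameter, and the toy running-sum state machine are all assumptions made for illustration. The sketch only shows why each replica stores a fraction of the history, and why a restore has to touch many replicas.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Replica:
    """Holds only some segments of the master's command history (hypothetical layout)."""
    name: str
    # segment index -> commands in that segment
    segments: Dict[int, List[int]] = field(default_factory=dict)

    def stored_commands(self) -> int:
        return sum(len(seg) for seg in self.segments.values())


def apply(state: int, command: int) -> int:
    """Toy deterministic transition: the state is a running sum of commands."""
    return state + command


def distribute_history(history: List[int], replicas: List[Replica], segment_size: int) -> None:
    """Split the history into fixed-size segments and assign them round-robin (assumed policy)."""
    for start in range(0, len(history), segment_size):
        index = start // segment_size
        owner = replicas[index % len(replicas)]
        owner.segments[index] = history[start:start + segment_size]


def restore(replicas: List[Replica]) -> Tuple[int, int]:
    """Rebuild the state by fetching every segment in order and replaying it.

    Returns the restored state and the number of segment fetches, a rough
    stand-in for restore cost.
    """
    located = {idx: rep for rep in replicas for idx in rep.segments}
    state, fetches = 0, 0
    for idx in sorted(located):
        for command in located[idx].segments[idx]:
            state = apply(state, command)
        fetches += 1
    return state, fetches


history = list(range(1, 13))  # 12 commands executed by the master
replicas = [Replica(f"r{i}") for i in range(3)]
distribute_history(history, replicas, segment_size=2)
state, fetches = restore(replicas)
```

In this toy setup each of the three replicas keeps 4 of the 12 commands, while the restoring replica needs one fetch per segment, so the number of fetches grows with the history length. This mirrors, in a simplified way, the trade-off between memory consumption and restore duration that the abstract reports.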