A distributed checkpoint mechanism for replicated state machines


Çelikel N. Ö., OVATMAN T.

CLOSER 2020 - Proceedings of the 10th International Conference on Cloud Computing and Services Science, Prague, Czech Republic, 7-9 May 2020, pp. 515-520

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.5220/0009797405150520
  • City of Publication: Prague
  • Country of Publication: Czech Republic
  • Pages: pp. 515-520
  • Keywords: Replicated State Machines, Distributed Checkpoints, Fault Tolerance
  • Affiliated with Istanbul Technical University: Yes

Abstract

This study presents preliminary results from a distributed checkpointing approach developed for use with replicated state machines. Our approach takes advantage of splitting the execution history of the master state machine into partial histories and storing them in a distributed way. Our initial results show that this approach reduces memory consumption both for the running replicas and for the replica being restored after a failure. On the other hand, for larger histories and larger numbers of replicas, it also increases the restore duration, which is a major drawback.
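
The abstract does not describe the mechanism in detail, so the following is only a minimal illustrative sketch of the general idea of splitting a master's execution history across replicas and replaying it on restore. It is not the authors' algorithm. The names `Replica`, `apply`, `distribute`, and `restore`, the round-robin placement, the fixed segment size, and the toy "running sum" state machine are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica that stores only some segments of the master's history."""
    name: str
    segments: dict[int, list[int]] = field(default_factory=dict)


def apply(state: int, command: int) -> int:
    """Toy deterministic transition: the state is a running sum of commands."""
    return state + command


def distribute(history: list[int], replicas: list[Replica], segment_size: int) -> int:
    """Split the history into fixed-size segments and assign them round-robin.

    Round-robin placement is an assumption; returns the number of segments created.
    """
    count = 0
    for start in range(0, len(history), segment_size):
        replica = replicas[count % len(replicas)]
        replica.segments[count] = history[start:start + segment_size]
        count += 1
    return count


def restore(replicas: list[Replica], segment_count: int, initial_state: int = 0) -> int:
    """Rebuild the state by fetching segments in order and replaying them."""
    state = initial_state
    for index in range(segment_count):
        # Locate the replica that owns this segment (one lookup per segment).
        owner = next(r for r in replicas if index in r.segments)
        for command in owner.segments[index]:
            state = apply(state, command)
    return state


history = list(range(1, 11))  # commands 1..10 executed by the master
replicas = [Replica("r0"), Replica("r1"), Replica("r2")]
segment_count = distribute(history, replicas, segment_size=2)
restored_state = restore(replicas, segment_count)
# Largest number of commands any single replica has to keep in memory.
largest_share = max(sum(len(s) for s in r.segments.values()) for r in replicas)
```

In this toy run no replica holds more than 4 of the 10 commands, while the restoring side has to locate and fetch every segment from its owner before replaying it. Under these assumptions, that mirrors the trade-off reported in the abstract: lower per-replica memory use at the cost of a longer restore as the history and the number of replicas grow.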