Files in this item
Files | Description | Format |
---|---|---|
application/pdf |
Description
Title: | Zorro: Zero-Cost Reactive Failure Recovery in Distributed Graph Processing |
Author(s): | Pundir, Mayank; Leslie, Luke M.; Gupta, Indranil; Campbell, Roy H. |
Subject(s): | Distributed graph processing; Failure recovery; Reactive approaches; Checkpointing |
Abstract: | Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unfortunately, these approaches (such as checkpointing) entail significant overhead. In this paper, we argue that distributed graph processing systems should instead use a reactive approach to failure recovery. The reactive approach gives up some completeness of the result (generating a slightly inaccurate result) in exchange for reducing the overhead during failure-free execution to zero. We build a system called Zorro that embodies this reactive approach, and integrate Zorro into two graph processing systems, PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication (inherent in today’s graph processing systems) to quickly rebuild the state of failed servers. Experiments using real-world graphs demonstrate that Zorro is able to recover over 99% of the graph state when a few servers fail, and between 87% and 92% when half the cluster fails. Furthermore, using eight common graph processing algorithms, Zorro incurs little to no accuracy loss in all experimental failure scenarios. |
Issue Date: | 2015-05-07 |
Genre: | Technical Report |
Type: | Text |
Language: | English |
URI: | http://hdl.handle.net/2142/75959 |
Date Available in IDEALS: | 2015-05-07 |
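
The abstract describes rebuilding the state of failed servers from vertex replicas that already exist on surviving servers. The following is a minimal sketch of that general idea only. It is not taken from the report and is not Zorro's implementation: the function `recover_from_replicas`, the `owner` and `states` structures, and the toy values are all hypothetical names and data chosen for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical model: each server stores values for the vertices it owns
# plus replica copies of some vertices owned by other servers.
ServerState = Dict[int, float]  # vertex id -> last known value


def recover_from_replicas(
    owner: Dict[int, str],
    states: Dict[str, ServerState],
    failed: Set[str],
) -> Tuple[ServerState, Set[int]]:
    """Rebuild values of vertices owned by failed servers from surviving copies.

    Returns the recovered values and the set of vertices with no surviving copy.
    """
    recovered: ServerState = {}
    lost: Set[int] = set()
    survivors = [s for s in states if s not in failed]
    for vertex, home in owner.items():
        if home not in failed:
            continue  # the owning server is alive; nothing to rebuild
        for server in survivors:
            if vertex in states[server]:
                # A surviving server holds a copy; reuse its value.
                recovered[vertex] = states[server][vertex]
                break
        else:
            # No surviving copy exists; this vertex's state is lost.
            lost.add(vertex)
    return recovered, lost


# Toy cluster (assumed data): server "A" fails.
owner = {1: "A", 2: "A", 3: "B", 4: "C"}
states = {
    "A": {1: 0.4, 2: 0.1, 3: 0.3},
    "B": {3: 0.3, 1: 0.4},
    "C": {4: 0.2, 3: 0.3},
}
recovered, lost = recover_from_replicas(owner, states, failed={"A"})
fraction = len(recovered) / (len(recovered) + len(lost))
```

In this toy setup, vertex 1 is recovered from the copy on server "B", while vertex 2 has no surviving copy and is reported as lost. A vertex in that position is the kind of gap that would make a reactive recovery incomplete.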