Files in this item

application/pdf: Zorro__IDEALS_6.4.15.pdf (1MB)

Title: Zorro: Zero-Cost Reactive Failure Recovery in Distributed Graph Processing
Author(s): Pundir, Mayank; Leslie, Luke M.; Gupta, Indranil; Campbell, Roy H.
Subject(s): Distributed graph processing; Failure recovery; Reactive approaches
Abstract: Distributed graph processing systems largely rely on proactive techniques for failure recovery. Unfortunately, these approaches (such as checkpointing) entail a significant overhead. In this paper, we argue that distributed graph processing systems should instead use a reactive approach to failure recovery. The reactive approach gives up some completeness of the result (it may generate a slightly inaccurate result) in exchange for reducing the overhead during failure-free execution to zero. We build a system called Zorro that embodies this reactive approach, and integrate Zorro into two graph processing systems, PowerGraph and LFGraph. When a failure occurs, Zorro opportunistically exploits vertex replication (inherent in today’s graph processing systems) to quickly rebuild the state of failed servers. Experiments using real-world graphs demonstrate that Zorro is able to recover over 99% of the graph state when a few servers fail, and between 87% and 92% when half the cluster fails. Furthermore, across eight common graph processing algorithms, Zorro incurs little to no accuracy loss in all experimental failure scenarios.
Issue Date: 2015-05-07
Genre: Technical Report
Date Available in IDEALS: 2015-05-07
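
The replication-based recovery idea described in the abstract can be illustrated with a toy sketch. This is not Zorro's implementation and is not taken from the report: the `placement` mapping, the function names, and the example data are all hypothetical. It only assumes that each vertex's state is held on one or more servers, and that a vertex is recoverable if at least one copy sits on a surviving server.

```python
def split_by_recoverability(placement, failed):
    """Partition vertices by whether any copy of their state survives.

    placement: dict mapping vertex id -> set of server ids holding a copy
               of that vertex's state (hypothetical layout for illustration).
    failed:    set of server ids that have crashed.
    Returns (recoverable, lost) as two sets of vertex ids.
    """
    recoverable, lost = set(), set()
    for vertex, servers in placement.items():
        # A vertex is recoverable if some server holding it did not fail.
        if servers - failed:
            recoverable.add(vertex)
        else:
            lost.add(vertex)
    return recoverable, lost


def recoverable_fraction(placement, failed):
    """Fraction of vertices whose state is still available after the failure."""
    recoverable, _ = split_by_recoverability(placement, failed)
    return len(recoverable) / len(placement)


# Hypothetical four-vertex graph spread over servers 0..3.
example_placement = {
    "a": {0, 1},
    "b": {1, 2},
    "c": {2},      # no replica elsewhere: lost if server 2 fails
    "d": {0, 3},
}
```

In this sketch, failing server 2 leaves "c" with no surviving copy, so such a vertex would have to be reinitialized, which is where the slight inaccuracy mentioned in the abstract would come from under these assumptions.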
