Files in this item



paper.pdf (application/pdf, 292kB)


Title: Confluence: A System for Lossless Multi-Source Single-Sink Data Collection
Author(s): Patel, Jay A.; Cho, Brian; Gupta, Indranil
Subject(s): Distributed Systems; Computer Networks
Abstract: Distributed environments often require collection of large amounts of critical and raw data from multiple locations to a central clearinghouse, e.g., task results or large datasets from multiple clouds, logs from multiple PlanetLab nodes, video transcripts in tele-immersive settings, etc. We present the design, implementation and evaluation of Confluence, a system for rapid and lossless transfer of unique files from multiple source nodes to a single sink node. First, we formally model the multi-source single-sink data collection problem for a static network and present an optimal solution in terms of total transfer time. Second, we build in mechanisms to make the system workable in dynamic networks. The resulting Confluence system builds an adaptive source-2-source (s2s) overlay amongst participating nodes, which exploits spatial as well as temporal heterogeneity of available bandwidth. We conduct an evaluation of Confluence on PlanetLab traces in ns-2. Results show that Confluence can improve total transfer time by as much as 40% (with up to 50 sources).
Issue Date: 2009-07-24
Genre: Technical Report
Publication Status: unpublished
Peer Reviewed: not peer reviewed
Date Available in IDEALS: 2009-07-24
