Files in this item

File: Stochastic-Mode ... in Distributed Systems.pdf (2MB)
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Stochastic-Model-Driven Adaptation and Recovery in Distributed Systems
Author(s): Joshi, Kaustubh Raghunandan
Subject(s): distributed systems
Abstract: Dependability is becoming a requirement in an increasing number of domains, including those that were previously thought to be noncritical. Examples include large distributed systems deployed in domains such as e-commerce, information mining, messaging, and entertainment. Such systems challenge existing fault tolerance approaches because they require low-cost solutions that can be adapted to work with off-the-shelf components. At the same time, their scale makes it difficult to accurately diagnose faults and recover from them. This dissertation proposes a model-based, theoretically well-founded recovery framework based on partially observable Markov decision processes. The framework is inexpensive to deploy, can cope with a variety of recovery mechanisms, and can tolerate system monitoring that may be imperfect, imprecise, or conflicting. At the same time, it generates recovery decisions that ensure that recovery will be stable, provide guarantees on the success of the recovery, and recover the system while incurring as low a cost as possible, thus approximating optimality. We are unaware of any other framework for recovery in distributed systems that integrates monitoring and recovery in an iterative manner, can deal with imprecise system states and selectively choose actions that either gather information or make progress towards recovery, and generates recovery policies that minimize costs over entire sequences of recovery actions. We have implemented our approach in a tool called the "Adaptation and Recovery Management framework." We demonstrate that this tool can be used to provide diagnosis and recovery capabilities in practical information systems.
Issue Date: 2007-05
Genre: Technical Report
Type: Text
URI: http://hdl.handle.net/2142/11321
Other Identifier(s): UIUCDCS-R-2007-2832
Rights Information: You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).
Date Available in IDEALS: 2009-04-22

