Files in this item

application/pdf: 9314963.pdf (5MB), restricted to U of Illinois

Title: A hybrid fault injection environment for measuring system dependability
Author(s): Young, Luke Titus
Doctoral Committee Chair(s): Iyer, Ravishankar K.
Department / Program: Electrical and Computer Engineering
Discipline: Electrical Engineering
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Engineering, Electronics and Electrical; Computer Science
Abstract: This thesis describes a test environment for evaluating computer system dependability, in which faults are injected via software and their impact is measured by both software and hardware. The hybrid nature of the environment introduces minimal perturbation and provides a high degree of control over the location of the faults to be injected. With this environment, faults can be injected into any location that has a physical address, e.g., CPU registers, cache, local memory, mass storage, and network controllers. Faults can also be injected into locations allocated to a single executing user program or even into the kernel, and propagation can be characterized down to the instruction level. The environment is well suited for measuring extremely short error latencies. We illustrate the environment by applying it to the study of two commercial systems: a Unix-based Tandem Integrity system and a Texas Instruments Explorer II Lisp machine.
The capabilities of the environment yielded several key results. Latency was measured with a high degree of accuracy (within 20 ns). Measurements of the sensitivity of different instructions to faults indicate a 5 percent chance that a faulted MIPS RISC instruction will not fail when executed. Modeling of multi-level error propagation shows that error detections were due to multiple corruptions of state in as many as 57 percent of reads to wrong addresses and 37 percent of writes to wrong addresses. The median latency associated with error detection by an individual CPU was on the order of 10 μs, and the median delay between detection and the start of CPU shutdown was on the order of 100 ms. Kernel fault injection studies show that a fault in the kernel is 2.6 times as likely to bring down a CPU as a fault elsewhere.
Issue Date: 1993
Rights Information: Copyright 1993 Young, Luke Titus
Date Available in IDEALS: 2011-05-07
Identifier in Online Catalog: AAI9314963
OCLC Identifier: (UMI)AAI9314963
