Files in this item



Yin_Zuoning.pdf (application/pdf, 910kB)


Title: Characterizing system failures in commercial and open source systems
Author(s): Yin, Zuoning
Director of Research: Zhou, Yuanyuan
Doctoral Committee Chair(s): Zhou, Yuanyuan
Doctoral Committee Member(s): Caesar, Matthew C.; Voelker, Geoffrey M.; Zhai, ChengXiang
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): System Reliability; System Manageability; Characteristic study
Abstract: With the advance of technology, current systems are becoming much more powerful in computation, much faster in data transfer, and much more abundant in data storage. However, system reliability and manageability have been left behind: current systems still fail quite often in the field. Understanding the characteristics of system failures is a prerequisite for devising effective solutions to these problems. This thesis focuses on failures introduced by incorrect bug fixes (a.k.a. buggy patches) and configuration errors. Bug fixing is done by humans, so it can itself introduce mistakes, namely incorrect fixes. These incorrect fixes not only aggravate the damage to end users but also harm software vendors' reputations. We therefore conducted one of the most comprehensive characteristic studies of incorrect bug fixes, drawing on four large operating system code bases, including a commercial OS project. We studied the ratio and impact of incorrect fixes and found that incorrect fixes are a significant problem that requires special attention. We also studied the common patterns of mistakes made during bug fixing, which can be used to alert programmers and to design detection tools that catch these incorrect fixes. Finally, we studied developers' code knowledge and found that inadequate code knowledge may increase the chance of incorrect fixes. Configuration errors (i.e., misconfigurations) are another dominant cause of system failures. Unfortunately, the characteristics of misconfigurations have rarely been studied in the past. We therefore took the initiative to conduct a real-world misconfiguration characteristic study. We studied a total of 546 misconfiguration cases, including 309 cases from a commercial storage system deployed at thousands of customers and 237 cases from four widely used open source systems (CentOS, MySQL, Apache HTTP Server, and OpenLDAP).
Our study covers several dimensions of misconfigurations, including types, causes, impact, and system reactions. Our major findings include: 1) A majority of misconfigurations are due to mistakes in setting configuration parameters; however, non-parameter mistakes are still sizable. 2) Between 38.1% and 53.7% of parameter mistakes are caused by illegal parameters that clearly violate some format or rules, motivating the use of an automatic checker to detect them. 3) A significant percentage (12.2% to 29.7%) of parameter-based mistakes are due to inconsistencies between different parameter values.
Issue Date: 2012-05-22
Rights Information: Copyright 2012 Zuoning Yin
Date Available in IDEALS: 2012-05-22
Date Deposited: 2012-05
