Files in this item



application/pdf: JHA-DISSERTATION-2021.pdf (9MB), Restricted Access


Title: Assessing dependability of emergent large-scale autonomous systems in the wild
Author(s): Jha, Saurabh
Director of Research: Iyer, Ravishankar K.
Doctoral Committee Chair(s): Iyer, Ravishankar K.
Doctoral Committee Member(s): Hwu, Wen-mei W.; Kramer, William T.; Xu, Tianyin; Keckler, Steve
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): Machine Learning; Autonomous Systems
Abstract: Emergent computer systems in transportation, healthcare, and enterprise systems are increasingly adopting data-driven techniques using machine learning and artificial intelligence to automate their operation, management, and control. Their widespread use in mission-critical services that involve humans means that it is of paramount importance to provide an ever-increasing level of runtime system dependability. Dependability is a cross-cutting issue spanning the system stack, including hardware, software, and algorithms that compose the system. In addition to existing challenges, such as issues of failures, load balancing, and scalability, a significant challenge arises from the fact that these systems must make decisions in the presence of uncertainties stemming from the system (e.g., transient failures), environment/data (e.g., out-of-training distribution data), and computational models (e.g., inadequate training). An erroneous decision by the system, if not detected, will lead to silent failures and degradation that will propagate to all layers of the system, ultimately leading to catastrophic outcomes. Therefore, data-driven automation combined with the ever-increasing scale and complexity has exposed these systems to emerging failures, attacks, and performance degradation modes that are difficult to deal with using existing techniques in a dynamically evolving, multi-tenant environment. The phenomenon is exemplified by several newsworthy headlines, such as an Uber self-driving car colliding with and killing a pedestrian.
This thesis develops novel data-driven methods and techniques for assuring dependability by (i) understanding the fundamental challenges to achieving system dependability that emerge due to the use of data-driven automation techniques, (ii) rigorously validating the system, including its runtime operational characteristics, and (iii) developing runtime monitoring techniques to detect, identify, and isolate events that threaten system dependability. The methods proposed in this thesis have been demonstrated on significant and broad user-inspired cases of societal importance with significantly different dependability requirements: (i) autonomous vehicles (AVs), and (ii) large-scale high-performance computing (HPC) systems.
Issue Date: 2021-12-02
Rights Information: Copyright 2021 Saurabh Jha
Date Available in IDEALS: 2022-04-29
Date Deposited: 2021-12
