Enhancing safety and resilience in AI-driven systems: from autonomous vehicles to data centers
Chen, Ziheng
Permalink
https://hdl.handle.net/2142/129342
Description
Title
Enhancing safety and resilience in AI-driven systems: from autonomous vehicles to data centers
Author(s)
Chen, Ziheng
Issue Date
2025-05-08
Director of Research (if dissertation) or Advisor (if thesis)
Iyer, Ravishankar K.
Department of Study
Electrical & Computer Eng
Discipline
Electrical & Computer Engr
Degree Granting Institution
University of Illinois Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Dependable Systems
Autonomous Vehicles
GPU Resilience
High-performance Computing
Language
eng
Abstract
Artificial intelligence (AI) has been widely adopted due to its advanced capabilities, finding applications in autonomous vehicles and, more recently, in large language models (LLMs) powered by transformers. As AI systems continue to expand in scope and influence, ensuring their reliability and robustness becomes increasingly critical, particularly in unforeseen scenarios where failures can have significant consequences. Furthermore, training and deploying large-scale models carries substantial computational and economic costs and often requires extensive GPU clusters, so the underlying infrastructure must be highly fault-tolerant to maintain efficiency and stability. This thesis presents our work on iPrism, a framework designed to enhance the safety of autonomous vehicles by reducing collision rates through AI-driven risk assessment and mitigation. It also discusses the work "Characterizing GPU Resilience and Impact on AI/HPC Systems," which examines GPU failure patterns in the Delta supercomputer to improve system reliability.