Files in this item

Files | Description | Format
TANG-THESIS-2020.pdf (3MB), restricted to U of Illinois | (no description provided) | PDF

Description

Title: Robust imitation learning from observation
Author(s): Tang, Zhenyi
Advisor(s): Driggs-Campbell, Katherine
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): Imitation learning, imitation learning from observation, robustness
Abstract: Imitation learning, sometimes referred to as learning from demonstrations, has been used in real-world scenarios such as autonomous driving and robotics control because of its sample efficiency and computational feasibility. However, imitation learning often suffers from compounding error and data mismatch, which lead to a lack of robustness. Another drawback is that traditional imitation learning usually assumes that data for both states and actions is accessible. In reality, data about the actions experts took may be more difficult to access than data about state transitions. For example, a driving video clip shows each state (traffic signal, road condition, map navigation, etc.) the vehicle is in, but does not contain associated information about whether the driver steers left or right in a given transition. To address these two issues, we propose an algorithm called Robust Imitation Learning from Observation (RILfO), which aims to provide robustness in an imitation learning from observation setting. First, we allow the agent to learn a policy given state-only demonstrations from experts. Second, we introduce an adversarial agent that aims to optimally destabilize the system by carefully engineering its loss function. We jointly train the agent and adversary so that the adversary is reinforced and the agent explores more possibilities, thus becoming more robust to various adversarial conditions. We experimentally test RILfO in multiple benchmark environments, compare it with baseline methods, and demonstrate its robustness. We also discuss its limitations and opportunities for future work.
Issue Date: 2020-05-11
Type: Thesis
URI: http://hdl.handle.net/2142/108132
Rights Information: Copyright 2020 Zhenyi Tang
Date Available in IDEALS: 2020-08-26
Date Deposited: 2020-05

