Files in this item
| Files | Description | Format |
|---|---|---|
| | (no description provided) | application/pdf |
Description
Title: | Faster apprenticeship learning through inverse optimal control |
Author(s): | Zaytsev, Andrey |
Advisor(s): | Peng, Jian |
Department / Program: | Computer Science |
Discipline: | Computer Science |
Degree Granting Institution: | University of Illinois at Urbana-Champaign |
Degree: | M.S. |
Genre: | Thesis |
Subject(s): | Apprenticeship learning; Inverse reinforcement learning; Inverse optimal control; Deep learning; Reinforcement learning; Machine learning |
Abstract: | One of the fundamental problems of artificial intelligence is learning how to behave optimally. With applications ranging from self-driving cars to medical devices, this task is vital to modern society. There are two complementary problems in this area – reinforcement learning and inverse reinforcement learning. While reinforcement learning tries to find an optimal strategy in a given environment with known rewards for each action, inverse reinforcement learning, or inverse optimal control, seeks to recover the rewards associated with actions given the environment and an optimal policy. Typically, apprenticeship learning is approached as a combination of these two techniques. This is an iterative process – at each step, inverse reinforcement learning is applied first to estimate the rewards, followed by reinforcement learning to produce a guess for an optimal policy. Each guess is used in subsequent iterations to come up with a more accurate estimate of the reward function. While this works for problems with a small number of discrete states, the approach scales poorly. In order to mitigate those limitations, this research proposes a robust approach based on recent advances in the field of deep learning. Using the matrix formulation of inverse reinforcement learning, a reward function and an optimal policy can be recovered without having to iteratively optimize both. The approach scales well for problems with very large and continuous state spaces, such as autonomous vehicle navigation. An evaluation performed using OpenAI RLLab suggests that this method is robust and ready to be adopted for solving problems in both research and industry. |
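The iterative scheme the abstract describes (an IRL step to guess rewards, then an RL step to compute a best-response policy, repeated until the guesses converge) can be sketched on a toy problem. The sketch below uses the projection-style apprenticeship-learning loop in the spirit of Abbeel and Ng; the 3-state deterministic MDP, the one-hot features, and all names are illustrative assumptions, not the method proposed in the thesis.

```python
import numpy as np

# Hypothetical toy MDP (illustration only): 3 states, 2 actions.
# Action 0 moves "left", action 1 moves "right"; dynamics are deterministic.
GAMMA = 0.9
N_S = 3
NEXT = np.array([[0, 1],
                 [0, 2],
                 [1, 2]])  # NEXT[s, a] = successor state
PHI = np.eye(N_S)          # one-hot state features

def feature_expectations(policy, start=0, horizon=50):
    """Discounted feature counts of a deterministic policy from `start`."""
    mu, s = np.zeros(N_S), start
    for t in range(horizon):
        mu += GAMMA ** t * PHI[s]
        s = NEXT[s, policy[s]]
    return mu

def optimal_policy(w, sweeps=100):
    """RL step: value iteration for the linear reward r(s) = w . phi(s)."""
    r = PHI @ w
    V = np.zeros(N_S)
    for _ in range(sweeps):
        V = r + GAMMA * V[NEXT].max(axis=1)
    return V[NEXT].argmax(axis=1)

# A demonstrated "expert" policy that always moves right (prefers state 2).
expert = np.array([1, 1, 1])
mu_E = feature_expectations(expert)

# Apprenticeship learning: alternate IRL and RL steps.
pi = np.array([0, 0, 0])               # arbitrary initial policy guess
mu_bar = feature_expectations(pi)
for _ in range(10):
    w = mu_E - mu_bar                  # IRL step: reward-weight guess
    pi = optimal_policy(w)             # RL step: best response to the guess
    mu = feature_expectations(pi)
    d = mu - mu_bar                    # projection update toward mu_E
    mu_bar = mu_bar + (d @ (mu_E - mu_bar)) / (d @ d + 1e-12) * d
    if np.linalg.norm(mu_E - mu_bar) < 1e-6:
        break

print(pi)                              # recovered policy matches the expert
```

On this toy problem the learned policy matches the expert after a couple of iterations; the thesis's contribution is precisely to avoid repeating this expensive inner RL step when state spaces are large or continuous.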
Issue Date: | 2017-12-05 |
Type: | Text |
URI: | http://hdl.handle.net/2142/99228 |
Rights Information: | Copyright 2017 Andrey Zaytsev |
Date Available in IDEALS: | 2018-03-13; 2020-03-14 |
Date Deposited: | 2017-12 |
This item appears in the following Collection(s)
- Dissertations and Theses - Computer Science (Dissertations and Theses from the Dept. of Computer Science)
- Graduate Dissertations and Theses at Illinois