Files in this item



application/pdfTHANGEDA-THESIS-2020.pdf (3MB)Restricted to U of Illinois
(no description provided)PDF


Title:Efficient learning and planning using spatial side information
Author(s):Thangeda, Pranay
Advisor(s):Ornik, Melkior
Department / Program:Aerospace Engineering
Discipline:Aerospace Engineering
Degree Granting Institution:University of Illinois at Urbana-Champaign
Subject(s):Autonomous Systems
Reinforcement Learning
Optimal Planning
Transit Networks
Abstract:This thesis investigates the following question: how to efficiently integrate side information, available either a priori or online, with existing algorithms for learning and planning in environments with stochastic features? Side information in this context refers to any information that does not directly determine system parameters, but indicates a relationship between them. Such information can often be obtained from existing data, including that collected by onboard sensors. Algorithms that exploit side information are of interest in solving many real-world problems that can be modeled as stochastic control processes with unknown transition probabilities or unknown transition times. Specifically, we consider the problems of reward maximization in grid-world environments with unknown, stochastic dynamics and travel time minimization in urban transit routing problems with deterministic dynamics and stochastic travel times. Exploiting additional information available to solve these problems, when classical algorithms leave much to be desired in terms of performance and accuracy, is the main theme of this thesis. The first part of the thesis proposes the idea of indirect sampling for accelerated learning in Markov decision processes when additional information is available in the form of bounds on the differences between the transition probabilities at different states. In addition, it proposes a greedy approximation algorithm that utilizes the additional side information to effectively balance exploration and exploitation. It also analyzes the performance of indirect sampling algorithms in different information settings and defines the notion of agent safety, a vital consideration for systems operating in the physical environment, in the context of our problem. Under certain assumptions, it provides guarantees on the safety of an agent exploring with our algorithm that exploits side information. The second part proposes a methodology and a tool that, given an origin-destination pair, a travel time budget, and a measure of the passenger's tolerance for ambiguity, provide the optimal online route choice in a transit network by balancing the objectives of maximizing on-time arrival probability and minimizing expected travel time. This framework is a significant improvement over existing algorithms where the problem of optimal routing in urban transit networks is usually studied with only the least expected travel time as the performance criteria under the assumption of travel time independence on different road segments. The proposed algorithm utilizes side information, available in the form of historic travel time data and upstream real-time data, to build and update the underlying model online. We demonstrate the utility and the performance of the proposed algorithms with the help of realistic numerical experiments conducted (i) on a fixed-route bus system that serves the residents of the Champaign-Urbana metropolitan area and, (ii) in the setting of a Mars rover navigating on unknown or partially known terrain. In both of these problems, data from onboard sensors and external sources acts as the side information.
Issue Date:2020-12-09
Rights Information:Copyright 2020 Pranay Thangeda
Date Available in IDEALS:2021-03-05
Date Deposited:2020-12

This item appears in the following Collection(s)

Item Statistics