Files in this item



application/pdf: GUPTA-DISSERTATION-2020.pdf (3MB)


Title: Sample-efficient reinforcement learning
Author(s): Gupta, Harsh
Director of Research: Srikant, Rayadurgam
Doctoral Committee Chair(s): Srikant, Rayadurgam
Doctoral Committee Member(s): Hajek, Bruce; Raginsky, Maxim; He, Niao
Department / Program: Electrical & Computer Eng
Discipline: Electrical & Computer Engr
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): reinforcement learning; sample-efficient learning; stochastic approximation
Abstract: Reinforcement learning has been instrumental in the recent advances made by artificial intelligence agents in various domains. Most of these advances have been enabled by the availability of huge amounts of training data. However, in several practical applications, such as those arising in wireless networks, robotics, and self-driving cars, it is expensive and sometimes completely infeasible to collect very large amounts of data. In this work, we study four such model-free reinforcement learning problems. The first is the structured multi-armed bandits problem, motivated by an application in wireless networks. The second is the bandits with two-level feedback problem, motivated by an application in panoramic video streaming. The third is the analysis of two-time-scale reinforcement learning algorithms, and the fourth is the analysis of the Double Q-learning algorithm. In each of these problems, our general goal is to understand theoretically how the different moving parts of the problem interact and, on the basis of the insights obtained from the theory, to design principled practical algorithms and heuristics that are sample-efficient.
Issue Date: 2020-12-02
Rights Information: Copyright 2020 Harsh Gupta
Date Available in IDEALS: 2021-03-05
Date Deposited: 2020-12
