Director of Research (if dissertation) or Advisor (if thesis)
Doctoral Committee Chair(s)
Department of Study
Electrical & Computer Eng
Electrical & Computer Engr
Degree Granting Institution
University of Illinois at Urbana-Champaign
Reinforcement learning has been instrumental in the recent advances made by artificial intelligence agents in various domains. Most of these advances have been enabled by the availability of huge amounts of training data. However, in several practical applications, such as those arising in wireless networks, robotics, and self-driving cars, collecting very large amounts of data is expensive and sometimes completely infeasible. In this work, we study four such model-free reinforcement learning problems. The first is the structured multi-armed bandit problem, motivated by an application in wireless networks. The second is the problem of bandits with two-level feedback, motivated by an application in panoramic video streaming. The third is the analysis of two-time-scale reinforcement learning algorithms, and the fourth is the analysis of the Double Q-learning algorithm. In each of these problems, our general goal is to understand theoretically the mechanics of the different moving parts of the problem and, on the basis of the insights obtained from the theory, to design principled, practical algorithms and heuristics that are sample-efficient.