The Role of Lookahead in Reinforcement Learning Algorithms
Winnicki, Anna
Permalink
https://hdl.handle.net/2142/124419
Description
Title
The Role of Lookahead in Reinforcement Learning Algorithms
Author(s)
Winnicki, Anna
Issue Date
2024-04-26
Director of Research (if dissertation) or Advisor (if thesis)
Srikant, R.
Doctoral Committee Chair(s)
Srikant, R.
Committee Member(s)
Hajek, Bruce
Wierman, Adam
Beck, Carolyn
Sowers, Richard
Department of Study
Electrical & Computer Engineering
Discipline
Electrical & Computer Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
Reinforcement Learning
Markov Decision Processes
Language
eng
Abstract
State-of-the-art reinforcement learning (RL) algorithms such as AlphaZero use lookahead, which is typically implemented using Monte Carlo Tree Search (MCTS). As the name suggests, lookahead simply means looking ahead several steps when computing the policy to be used. The fact that H-step lookahead yields an O(alpha^H) approximation to the optimal policy, where alpha is the discount factor, is a somewhat trivial and well-known statement. We have shown a much stronger result: lookahead leads to convergent learning algorithms, while the same algorithms may diverge in the absence of lookahead. We have demonstrated these results for three different classes of RL algorithms: modified policy iteration with linear value function approximation [1], Monte Carlo with exploring starts [2], and policy iteration for zero-sum Markov games [3]. We have also shown that lookahead can be efficiently implemented in the widely studied class of linear MDPs [3].
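
To make the notion of H-step lookahead concrete, the sketch below (an illustrative assumption, not code from the thesis) computes the H-step lookahead policy for a small tabular MDP with a known transition tensor P and reward matrix R, both hypothetical names: it applies the Bellman optimality operator H - 1 times to a value estimate V and then acts greedily. This is the quantity to which the O(alpha^H) approximation guarantee in the abstract refers.

```python
import numpy as np

def h_step_lookahead_policy(P, R, alpha, V, H):
    """Greedy policy obtained by looking ahead H steps from a value estimate V.

    P: transition tensor of shape (S, A, S), P[s, a, s'] = Pr(s' | s, a)
    R: reward matrix of shape (S, A)
    alpha: discount factor in (0, 1)
    V: current value-function estimate, shape (S,)
    H: lookahead horizon (H = 1 recovers the ordinary one-step greedy policy)
    """
    W = V.copy()
    # Apply the Bellman optimality operator H - 1 times to the estimate V ...
    for _ in range(H - 1):
        W = np.max(R + alpha * (P @ W), axis=1)
    # ... then take one final greedy step; the resulting policy is the
    # H-step lookahead policy, whose value approaches the optimum at rate O(alpha^H).
    Q = R + alpha * (P @ W)
    return np.argmax(Q, axis=1)
```

With H = 1 this reduces to the usual greedy policy; larger H shrinks the approximation error at the cost of a deeper search, which in large state spaces is approximated by MCTS rather than computed exactly as above.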