Passivity, no-regret, and performance in online learning and games
Abdelraouf, Hassan
This item's files can only be accessed by the System Administrators group.
Permalink
https://hdl.handle.net/2142/132762
Description
- Title
- Passivity, no-regret, and performance in online learning and games
- Author(s)
- Abdelraouf, Hassan
- Issue Date
- 2025-11-26
- Director of Research (if dissertation) or Advisor (if thesis)
- Shamma, Jeff
- Doctoral Committee Chair(s)
- Langbort, Cedric
- Committee Member(s)
- Dullerud, Geir
- Tsukamoto, Hiroyasu
- Department of Study
- Aerospace Engineering
- Discipline
- Aerospace Engineering
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Passivity; No-regret; Online Learning
- Abstract
- As autonomous AI agents become more widely deployed across dynamic, multi-agent environments, they will continuously learn and interact in real time to achieve complex goals. This thesis develops a control- and game-theoretic foundation to analyze and ultimately synthesize such systems in which adaptive agents evolve in the presence of other adaptive agents. Building on this motivation, the thesis investigates the interplay between passivity, no-regret, and performance of continuous-time learning dynamics. The analysis is divided into two parts: (i) the interaction between a learning model and a dynamic, uncertain environment, and (ii) the interaction among multiple adaptive learners within a game. In the first part, the learning dynamic model is viewed as an input–output operator that maps payoffs to strategies. Building on prior work for replicator dynamics, we show that if the learning dynamic model satisfies a passivity condition between the payoff vector and the deviation of its evolving strategy from any fixed strategy, it achieves finite regret. We then prove that this passivity condition holds for strategic higher-order variants of learning dynamics that have finite regret. We further provide numerical examples to illustrate the lack of finite regret of different evolutionary dynamic models that violate the passivity property. We also examine the fragility of the finite regret property under payoff perturbations. This raises an important question: is finite regret, by itself, a sufficient metric to assess the quality of learning dynamic models, or should additional performance measures be considered? Motivated by this consideration, the thesis addresses the "free-lunch" question in no-regret learning: whether one no-regret algorithm can outperform another in asymptotic average reward, so that an agent incurs regret for not having chosen a particular no-regret algorithm.
We develop a control-theoretic lens in which a learning dynamic model is modeled as a cascade interconnection between a diagonal LTI map $G(s)=g(s)I_n$ and the softmax nonlinearity, linking the frequency response $g(j\omega)$ (gain and phase) directly to asymptotic performance. We introduce payoff-based higher-order variants of replicator dynamics, namely anticipatory and predictive replicator dynamics, and show that the anticipatory model is dynamically equivalent to predictive replicator dynamics with a first-order low-pass predictor. An oracle (perfect-prediction) variant is proved to uniformly dominate the standard replicator dynamics, i.e., it achieves higher cumulative reward at every time horizon, across all environments. We then cast the performance comparison as a passivity question: passivity of an associated comparison system is equivalent to uniform dominance of one learning algorithm over another. This yields several free-lunch results: predictive exponential replicator dynamics with a low-pass predictor uniformly dominates the standard exponential replicator dynamics for any payoff trajectory; moreover, any predictive replicator with a passive, asymptotically stable predictor, including anticipatory replicator dynamics, locally dominates the standard replicator. Framing the global comparison between anticipatory and standard replicator as an optimal-control problem, we show that the minimal achievable performance gap is zero, implying uniform dominance of the anticipatory model across all environments. Lastly, we derive closed-form expressions for the long-run average reward and limiting strategy of replicator dynamics in arbitrary $2\pi$-periodic environments. In the second part, the focus shifts from the interaction of a single learner with a dynamic environment to the interaction among multiple learners within a game.
We establish a connection between finite regret and equilibrium-independent passivity (EI-passivity) through Best-Response Stationarity (BRS). Modeling the interaction between a learning dynamic (mapping payoffs to strategies) and a game (mapping strategies to payoffs) as a feedback interconnection, we exploit the fact that contractive games are anti-incrementally passive to show that incremental passivity is a stronger notion that implies both $\delta$-passivity and EI-passivity. Based on this connection, we develop a passivity-based classification of learning dynamics according to the passivity notion they satisfy (incremental passivity, $\delta$-passivity, or EI-passivity) and use this classification as a framework for convergence analysis in contractive games. More generally, we develop an incremental-stability analysis for payoff-based higher-order variants of replicator dynamics in matrix contractive games. Taken together, the results of this thesis provide a unified control-theoretic framework for analyzing and comparing the performance of online learning dynamics. Beyond the theoretical significance, these results bridge control theory, online learning, and game theory, offering concepts that can guide the design of stable, efficient, and robust autonomous learning systems operating in interactive, uncertain, and multi-agent environments.
- Graduation Semester
- 2025-12
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/132762
- Copyright and License Information
- © 2025 Hassan Abdelraouf. All rights reserved.
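Illustrative sketch (not taken from the thesis): the abstract describes a learning dynamic as a cascade of a diagonal LTI map $G(s)=g(s)I_n$ and the softmax nonlinearity. The Python snippet below is a minimal numerical sketch of one such cascade, assuming $g(s)=1/s$ (a pure integrator feeding softmax, the usual exponential-weights form of replicator dynamics). The payoff signal, the forward-Euler step, the horizon, and all function names are hypothetical choices made for illustration only. It integrates the payoffs, maps the integrator state through softmax, and reports the regret against the best fixed pure strategy, which for this cascade is expected to stay near or below $\ln n$.

```python
import math


def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def payoff(t):
    """Hypothetical periodic payoff vector with entries in [-1, 1]."""
    return [math.sin(t), math.cos(t), math.sin(2.0 * t)]


def simulate(horizon=50.0, dt=1e-3):
    """Forward-Euler simulation of z' = p(t), x = softmax(z).

    Returns the regret against the best fixed pure strategy and the
    final mixed strategy. Assumes g(s) = 1/s (integrator), z(0) = 0.
    """
    n = len(payoff(0.0))
    z = [0.0] * n            # integrator state; x(0) is uniform
    cum_payoff = [0.0] * n   # approximates the integral of p_i(t)
    reward = 0.0             # approximates the integral of x(t) . p(t)
    steps = int(round(horizon / dt))
    for k in range(steps):
        p = payoff(k * dt)
        x = softmax(z)
        reward += dt * sum(xi * pi for xi, pi in zip(x, p))
        for i in range(n):
            cum_payoff[i] += dt * p[i]
            z[i] += dt * p[i]
    regret = max(cum_payoff) - reward
    return regret, softmax(z)


regret, x_final = simulate()
print("regret vs. best fixed action:", round(regret, 4))
print("ln(n) reference value:", round(math.log(len(x_final)), 4))
```

Replacing the integrator update with a different linear filter would be one way to experiment with other choices of $g(s)$; the sketch makes no claim about which filters the thesis actually studies numerically.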
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois