Files in this item
File: JIANGDISSERTATION2015.pdf (2 MB)
Format: application/pdf
Description: (no description provided)
Description
Title:  On the resolution of misspecification in stochastic optimization, variational inequality, and game-theoretic problems 
Author(s):  Jiang, Hao 
Director of Research:  Shanbhag, Uday 
Doctoral Committee Chair(s):  Nedich, Angelia 
Doctoral Committee Member(s):  Meyn, Sean; Beck, Carolyn 
Department / Program:  Industrial & Enterprise Systems Engineering 
Discipline:  Industrial Engineering 
Degree Granting Institution:  University of Illinois at Urbana-Champaign 
Degree:  Ph.D. 
Genre:  Dissertation 
Subject(s):  Misspecified Stochastic Optimization
Misspecified Variational Inequality
Misspecified Stochastic Nash Games
Misspecified Markov Decision Processes 
Abstract:  Traditionally, much of the research in the field of optimization algorithms has assumed that problem parameters are correctly specified. Recent efforts under the robust optimization framework have relaxed this assumption by allowing unknown parameters to vary within a prescribed uncertainty set and by subsequently solving for a worst-case solution. This dissertation considers a rather different approach in which the unknown or misspecified parameter is a solution to a suitably defined (stochastic) learning problem, based on having access to a set of samples. Practical approaches for resolving such a set of coupled problems have been either sequential or direct variational approaches. The former entails the following steps: (i) a solution to the learning problem for the parameters is first obtained; and (ii) the associated parametrized computational problem is then solved using the parameters from (i). Such avenues prove difficult to adopt, particularly since the learning process must be terminated finitely; consequently, in large-scale or stochastic instances, sequential approaches may often be corrupted by error. On the other hand, a variational approach requires that the problem be recast as a possibly nonmonotone stochastic variational inequality problem, but no first-order (stochastic) schemes are currently available for the solution of such problems. Motivated by these challenges, this thesis studies joint schemes of optimization and learning in three settings: (i) misspecified stochastic optimization and variational inequality problems; (ii) misspecified stochastic Nash games; and (iii) misspecified Markov decision processes. In the first part of this thesis, we present a coupled stochastic approximation scheme which simultaneously solves both the optimization and the learning problems. 
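The coupled scheme described above pairs a gradient step on the computational problem with a simultaneous learning step on the parameter. The following is a minimal sketch under assumed data: the quadratic objective, noise model, and step sizes below are illustrative choices, not taken from the dissertation.

```python
import numpy as np

# Sketch of a coupled stochastic approximation scheme for a misspecified
# stochastic optimization problem:
#   minimize_x E[f(x; theta*, xi)],  where theta* solves a separate
#   learning problem  minimize_theta E[g(theta; eta)].
# Illustrative instance (an assumption for this sketch):
#   f(x; theta, xi) = 0.5*(x - theta)^2 + xi*x   (strongly convex in x)
#   g(theta; eta)   = 0.5*(theta - eta)^2, with E[eta] = theta*.

rng = np.random.default_rng(0)
THETA_STAR = 2.0       # true (unknown) parameter; here the optimum is x* = theta*

x, theta = 0.0, 0.0
for k in range(1, 20001):
    gamma = 1.0 / k                        # diminishing step size for both updates
    xi = rng.normal(0.0, 0.1)              # noise in the gradient of f
    eta = rng.normal(THETA_STAR, 0.1)      # sample for the learning problem
    # (i) optimization step uses the *current* belief theta, not theta*
    x = x - gamma * ((x - theta) + xi)
    # (ii) learning step refines the belief simultaneously
    theta = theta - gamma * (theta - eta)

print(x, theta)   # both iterates approach THETA_STAR
```

The key point the sketch illustrates is that neither update waits for the other to finish: the optimization iterate tracks a drifting belief that itself converges almost surely, which is what distinguishes the joint scheme from the sequential approach described above.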
The obtained schemes are shown to possess almost-sure convergence properties in regimes where the function $f$ is either strongly convex or merely convex. Importantly, the scheme displays the optimal rate for strongly convex problems, while in merely convex regimes, through an averaging approach, we quantify the degradation associated with learning by noting that the error in function value after $K$ steps is $O(\sqrt{\ln(K)/K})$, rather than the $O(\sqrt{1/K})$ obtained when $\theta^*$ is available. Notably, when the averaging window is modified suitably, it can be seen that the original rate of $O(\sqrt{1/K})$ is recovered. Additionally, we consider an online counterpart of the misspecified optimization problem and provide a non-asymptotic bound on the average regret with respect to an offline counterpart. We also extend these statements to a class of stochastic variational inequality problems, an object that unifies stochastic convex optimization problems and a range of stochastic equilibrium problems. Analogous almost-sure convergence statements are provided in strongly monotone and merely monotone regimes, the latter facilitated by an iterative Tikhonov regularization. In the merely monotone regime, under a weak-sharpness requirement, we quantify the degradation associated with learning and show that the expected error associated with $\mathrm{dist}(x_K, X^*)$ is $O(\sqrt{\ln(K)/K})$. In the second part of this thesis, we present schemes for computing equilibria for two classes of convex stochastic Nash games complicated by a parametric misspecification, a natural concern in the control of large-scale networked engineered systems. 
In both schemes, players learn the equilibrium strategy while resolving the misspecification: (1) Stochastic Nash games: We present a set of coupled stochastic approximation schemes distributed across agents, in which the first scheme updates each agent's strategy via a projected (stochastic) gradient step, while the second updates every agent's belief regarding its misspecified parameter using an independently specified learning problem. We proceed to show that the produced sequences converge to the true equilibrium strategy and the true parameter in an almost-sure sense. Surprisingly, convergence in the equilibrium strategy achieves the optimal rate of convergence in a mean-squared sense, with a quantifiable degradation in the rate constant. (2) Stochastic Nash-Cournot games with unobservable aggregate output: We refine (1) to a Cournot setting in which the tuple of strategies is unobservable, while payoff functions and strategy sets are public knowledge through a common-knowledge assumption. By utilizing observations of noise-corrupted prices, iterative fixed-point schemes are developed that allow for simultaneously learning the equilibrium strategies and the misspecified parameter in an almost-sure sense. In the third part of this thesis, we consider the solution of a finite-state infinite-horizon Markov decision process (MDP) in which both the transition matrix and the cost function are misspecified, the latter in a parametric sense. We consider a data-driven regime in which the learning problem is a stochastic convex optimization problem that resolves the misspecification. 
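The distributed Nash scheme in (1) above can be sketched on a toy instance: each agent takes a projected stochastic gradient step on its own strategy while independently averaging samples to update its belief about the misspecified parameter. The two-player quadratic game, noise model, and step sizes below are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical two-player stochastic Nash game with a misspecified parameter
# theta*. Agent i's (assumed) cost: 0.5*(x_i - theta)^2 + 0.1*x_i*x_j, over
# the strategy set [0, 10]. The true Nash equilibrium satisfies
# x_i = theta* - 0.1*x_j, i.e. x* = theta*/1.1 for both agents.

rng = np.random.default_rng(1)
THETA_STAR = 2.0
LO, HI = 0.0, 10.0                     # each agent's strategy set

x = np.zeros(2)                        # strategies
beliefs = np.zeros(2)                  # each agent's belief about theta*
for k in range(1, 20001):
    gamma = 1.0 / k
    for i in range(2):
        j = 1 - i
        xi = rng.normal(0.0, 0.1)      # gradient noise
        # projected stochastic gradient step, using the *current* belief
        grad = (x[i] - beliefs[i]) + 0.1 * x[j] + xi
        x[i] = np.clip(x[i] - gamma * grad, LO, HI)
        # independent learning step: agent i averages its own samples
        eta = rng.normal(THETA_STAR, 0.1)
        beliefs[i] = beliefs[i] - gamma * (beliefs[i] - eta)

print(x, beliefs)   # x approaches theta*/1.1; beliefs approach theta*
```

Note the distributed character the abstract emphasizes: each agent touches only its own strategy, its own belief, and its own samples; coupling enters solely through the observed rival strategy in the gradient.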
Via such a framework, we make the following contributions: (1) we first show that a misspecified value iteration scheme converges almost surely to its true counterpart, and that the mean-squared error after $K$ iterations is $O(\sqrt{1/K})$; (2) an analogous asymptotic almost-sure convergence statement is provided for misspecified policy iteration; and (3) finally, we present a constant-steplength misspecified Q-learning scheme and show that a suitable error metric is $O(\sqrt{1/K}) + O(\sqrt{\delta})$ after $K$ iterations, where $\delta$ is a bound on the steplength. 
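A misspecified value iteration scheme in the spirit of contribution (1) can be sketched on a tiny discounted MDP: the Bellman update is applied with the current parameter belief while that belief is learned from samples. The 2-state, 2-action MDP, cost parametrization, and discount factor below are assumptions for illustration, not the dissertation's model.

```python
import numpy as np

# Misspecified value iteration sketch: the stage cost c(s, a; theta) depends
# on a parameter theta* that is learned from samples *while* value iterates
# are computed with the current belief.

rng = np.random.default_rng(2)
GAMMA = 0.9                                 # discount factor (assumed)
THETA_STAR = 1.0
# P[a] is the (assumed, known) transition matrix under action a
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])

def cost(theta):
    # stage cost c[s, a]; the misspecification enters through theta
    return np.array([[1.0 * theta, 2.0],
                     [0.5, 1.5 * theta]])

V = np.zeros(2)
theta = 0.0
for k in range(1, 2001):
    # learning update: running average of parameter samples
    theta += (rng.normal(THETA_STAR, 0.1) - theta) / k
    # Bellman update with the *current* belief theta; (P @ V).T has shape [s, a]
    V = np.min(cost(theta) + GAMMA * (P @ V).T, axis=1)

# reference: value iteration under the true theta, for comparison
V_true = np.zeros(2)
for _ in range(2000):
    V_true = np.min(cost(THETA_STAR) + GAMMA * (P @ V_true).T, axis=1)

print(V, V_true)   # the misspecified iterates track the true value function
```

Because the Bellman operator is a contraction and the belief converges, the misspecified iterates end up within a neighborhood of the true value function whose size is governed by the residual learning error, mirroring the error bounds stated above.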
Issue Date:  2015-12-02 
Type:  Text 
URI:  http://hdl.handle.net/2142/89033 
Rights Information: 
Copyright 2015 Hao Jiang 
Date Available in IDEALS:  2016-03-02 
Date Deposited:  2015-12 
This item appears in the following Collection(s)

Dissertations and Theses  Industrial and Enterprise Systems Engineering

Graduate Dissertations and Theses at Illinois