Files in this item
Files  Description  Format 

application/pdf Golshid_Baharian Khoshkhou.pdf (773kB)  (no description provided) 
Description
Title:  Stochastic sequential assignment problem 
Author(s):  Baharian Khoshkhou, Golshid 
Director of Research:  Jacobson, Sheldon H. 
Doctoral Committee Chair(s):  Chen, Xin 
Doctoral Committee Member(s):  Jacobson, Sheldon H.; Kiyavash, Negar; Shanbhag, Vinayak V. 
Department / Program:  Industrial&Enterprise Sys Eng 
Discipline:  Industrial Engineering 
Degree Granting Institution:  University of Illinois at UrbanaChampaign 
Degree:  Ph.D. 
Genre:  Dissertation 
Subject(s):  sequential assignment
Markov decision process stationary policy hidden Markov model threshold criteria risk measure 
Abstract:  The stochastic sequential assignment problem (SSAP) studies the allocation of available distinct workers with deterministic values to sequentiallyarriving tasks with stochastic parameters so as to maximize the expected total reward obtained from the assignments. The difficulty and challenge in making the assignment decisions is that the assignments are performed in realtime; specifically, pairing a worker with a task is done without knowledge of future task values. This thesis focuses on studying practical variations and extensions of the SSAP, with the goal of eliminating restricting assumptions so that the problem setting converges to that of realworld problems. The existing SSAP literature considers a riskneutral objective function, seeking an assignment policy to maximize the expected total reward; however, a riskneutral objective function is not always desirable for the decisionmaker since the probability distribution function (pdf) of the total reward might carry a high probability of low values. To take this issue into account, the first part of this dissertation studies the SSAP under a risksensitive objective function. Specifically, the assignments are performed so as to minimize the threshold probability, which is the probability of the total reward failing to achieve a specified target (threshold). A targetdependent Markov decision process (MDP) is solved, and sufficient conditions for the existence of a deterministic Markov optimal policy are provided. An approximate algorithm is presented, and convergence of the approximate value function to the optimal value function is established under mild conditions. The second part of this thesis analyzes the limiting behavior of the SSAP as the number of assignments approaches infinity. The optimal assignment policy for the basic SSAP has a threshold structure and involves computing a new set of breakpoints upon the arrival of each task, which is cumbersome for largescale problems. To address this issue, the second part of this dissertation focuses on obtaining stationary (timeindependent) optimal assignment policies that maximize the longrun expected reward per task and are much easier to perform in realworld problems. An exponential convergence rate is established for the convergence of the expected total reward per task to the optimal value as the number of tasks approaches infinity. The limiting behavior of the SSAP is studied in two different settings. The first setting assumes an independent and identically distributed (IID) sequence of arriving tasks with observable distribution functions, while the second problem considers the case where task distributions are unobservable and belong to a pool of feasible distributions. The next part of this dissertation basically brings the first two parts together, studying the limiting behavior of the targetdependent SSAP, where the goal is finding an assignment policy that minimizes the probability of the longrun reward per task failing to achieve a given target value. It is proven that the abovementioned stationary policy (mentioned in the previous paragraph), which achieves the longrun expected reward per task, minimizes the longrun threshold probability in this problem as well. These two objective functions being optimized simultaneously by one assignment policy is interesting, since the threshold criteria, by definition, deviates from the expected total reward criteria; i.e., although it attempts to avoid hitting below a given target level as much as possible, it does not automatically and necessarily guarantee a reasonable performance in terms of the expected reward. Finally, stochasticity in the SSAP is extended to worker values in the last part of this thesis, where the worker values are assumed to be random variables, taking on new values upon each task arrival. Four models are introduced which analyze this problem from different aspects; e.g., the distribution function of worker values being identical or distinct, the worker values in a given time period being dependent on those in the preceding time periods (or within the same time period), worker values being deterministic but assumed to be functions of time (possibly deteriorating with time), and task values being independent or dependent on each other. For each of these models an optimal assignment policy is presented so as to maximize the expected total reward. 
Issue Date:  20140916 
URI:  http://hdl.handle.net/2142/50503 
Rights Information:  Copyright 2014 Golshid Baharian Khoshkhou 
Date Available in IDEALS:  20140916 
Date Deposited:  201408 
This item appears in the following Collection(s)

Dissertations and Theses  Industrial and Enterprise Systems Engineering

Graduate Dissertations and Theses at Illinois
Graduate Theses and Dissertations at Illinois