Online learning algorithms design with applications to clinical trial, serving system and job scheduling problem

Ruan, Yufei

Online learning algorithms design with applications to clinical trial, serving system and job scheduling problem

Ruan, Yufei

Permalink

https://hdl.handle.net/2142/129242

Description

Title

Online learning algorithms design with applications to clinical trial, serving system and job scheduling problem

Author(s)

Ruan, Yufei

Issue Date

2025-04-27

Director of Research (if dissertation) or Advisor (if thesis)

Zhou, Yuan

Doctoral Committee Chair(s)

Zhou, Yuan

Committee Member(s)

Srikant, Rayadurgam
Chen, Xin
Etesami, Rasoul

Department of Study

Industrial&Enterprise Sys Eng

Discipline

Industrial Engineering

Degree Granting Institution

University of Illinois Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Multi-Armed Bandits
online learning

Language

eng

Abstract

This thesis presents novel algorithms and theoretical analyses for various online learning applications, including job scheduling, model selection in serving systems, and clinical trials. Each application is framed as a Multi-Armed Bandits (MAB) or Linear Bandits problem, tailored to its specific requirements. In the job scheduling application, we address the challenge of scheduling jobs on a set of machines in an online fashion with the overall quality of service as close as possible to an optimal offline benchmark. We design a variant of the Upper Confidence Bound (UCB) algorithm, achieving a logarithmic regret of O(ln T), which effectively balances exploration and exploitation in resource-constrained environments. For the serving system application, we develop a model selection algorithm to optimize the deployment of machine learning models across multiple servers in parallel. Framing this as a stochastic MAB problem, we propose UCB-based / Batch Elimination Framework algorithms for both online and offline settings. These approaches achieve regret bounds matching the full-information MAB regret of O(ln T). Simulations on an image classification task with the ImageNet dataset validate the algorithms’ efficiency and scalability. In the clinical trial application, motivated by the need to reduce the time and resource costs associated with treatment evaluation, we study the batched Linear Bandits problem. We prove that only O(log log T) batches are needed to achieve the minimax optimal regret of O( \sqrt{dTmin{log K,d}}), even in the more restricted batch learning model. Along the way, this result proposes the distributional optimal design, a natural extension of the optimal experiment design, and provide a both statistically and computationally efficient learning algorithm for the problem, which may be of independent interest.

Graduation Semester

2025-05

Type of Resource

Thesis

Handle URL

https://hdl.handle.net/2142/129242

Copyright and License Information

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Online learning algorithms design with applications to clinical trial, serving system and job scheduling problem

Ruan, Yufei

Permalink

Description

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Dissertations and Theses - Industrial and Enterprise Systems Engineering

Log In