Cloud-bursting and autoscaling for Python-native scientific workflows
Liu, Tingkai
Permalink
https://hdl.handle.net/2142/120266
Description
Title
Cloud-bursting and autoscaling for Python-native scientific workflows
Author(s)
Liu, Tingkai
Issue Date
2023-04-20
Director of Research (if dissertation) or Advisor (if thesis)
Kindratenko, Volodymyr
Department of Study
Electrical and Computer Engineering
Discipline
Electrical and Computer Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
HPC
Cloud
Cloud-bursting
Language
eng
Abstract
In this work, the Ray framework is extended to enable automatic scaling of workloads on high-performance computing (HPC) clusters managed by SLURM and bursting to a Cloud managed by Kubernetes. Compared to existing HPC-Cloud convergence solutions, this framework offers several advantages: users can provide their own Cloud resources, the framework provides a Python-level abstraction that does not require users to interact with job submission systems, and a single Python-based parallel workload can run concurrently across an HPC cluster and a Cloud. Applications in Electronic Design Automation and distributed Machine Learning are used to demonstrate how this solution scales a workload on an on-premises HPC system and automatically bursts to a public Cloud when the allocated HPC resources run out. The thesis focuses on describing the initial implementation and demonstrating the novel functionality of the proposed framework, as well as identifying practical considerations and limitations of the Cloud-bursting mode. The code of this framework has been open-sourced.