Learning-based scheduling for ray-based Hybrid HPC-Cloud Systems
Lu, Yicheng
Permalink
https://hdl.handle.net/2142/124562
Description
- Title
- Learning-based scheduling for ray-based Hybrid HPC-Cloud Systems
- Author(s)
- Lu, Yicheng
- Issue Date
- 2024-04-30
- Director of Research (if dissertation) or Advisor (if thesis)
- Kindratenko, Volodymyr
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Cloud Bursting
- HPC
- Data Movement
- Scheduling
- Language
- eng
- Abstract
- Hybrid HPC-Cloud systems are becoming increasingly popular within the scientific community for their ability to efficiently manage sudden increases in demand, thus improving the processing times of HPC workloads. However, current systems lack efficient workload scheduling strategies to suit these hybrid environments and face considerable deployment challenges due to the intricate configurations required, particularly concerning data transfer between HPC and the cloud. To address these issues, we have developed an innovative HPC-Cloud bursting system using Ray, a well-known open-source distributed framework. Our system adopts a learning-based scheduling approach at the function level through a dynamic label-based architecture and automatically manages data movement between the cloud and HPC. Specifically, our scheduler proactively prefetches data based on anticipated demand and analyzes patterns of data movement and task execution to inform future scheduling decisions. Our system significantly improves the processing times of HPC workloads by hiding data transfer time and employing high-quality, learning-based scheduling decisions. We evaluated our system with two different workloads: machine learning model training and image processing. We conducted performance comparisons using conventional data retrieval methods and the default Ray scheduler under various network conditions and storage configurations. Our findings consistently show that our system significantly outperforms traditional methods in every tested scenario.
- Graduation Semester
- 2024-05
- Type of Resource
- Text
- Handle URL
- https://hdl.handle.net/2142/124562
- Copyright and License Information
- Copyright 2024 Yicheng Lu
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Siebel School of Computer Science