Adaptive runtime techniques for node-aware resource optimization
Chandrasekar, Kavitha
Permalink
https://hdl.handle.net/2142/129286
Description
- Title
- Adaptive runtime techniques for node-aware resource optimization
- Author(s)
- Chandrasekar, Kavitha
- Issue Date
- 2025-04-30
- Director of Research (if dissertation) or Advisor (if thesis)
- Kale, Laxmikant
- Doctoral Committee Chair(s)
- Kale, Laxmikant
- Committee Member(s)
- Torrellas, Josep
- Amato, Nancy
- Rountree, Barry
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- High Performance Computing
- Node-aware resource tuning
- Message aggregation
- Load balancing
- Abstract
- Adaptive runtime systems must cope with an increasing number of cores per node. Adaptive algorithms that worked well when nodes had fewer cores now face many challenges. This dissertation addresses these challenges on multiple fronts. The first challenge is how many cores on each node to use for a given application. We analyze this issue from the standpoint of both performance and energy savings, and demonstrate adaptive runtime techniques for tuning the number of cores used during execution. Secondly, many of the issues discussed here are meaningful mainly when using a shared-memory process spanning multiple cores of a node, rather than a "Charm++ everywhere" mode (analogous to MPI everywhere), where each process runs on a core of its own. The latter is fundamentally sub-optimal because it precludes sharing of resources and data structures. The former, however, suffers from performance issues, especially in communication, because it is harder to expose and exploit parallelism in network interface usage. We address this issue with a comprehensive analysis, and a communication layer design driven by that analysis, to demonstrate on-par communication performance and/or identify additional issues that need to be dealt with. Thirdly, adaptive runtime systems often have to deal with fine-grained communication, which necessitates adaptive aggregation of many short messages. However, large multicore nodes, along with the diverse needs of applications, complicate this aggregation. We develop metrics for characterizing the use cases, and develop adaptive algorithms that work well on multicore nodes and across varied use cases.
Fourthly, dynamic load balancing is another challenge: migrating work units across cores to balance load effectively in dynamic applications must take into account the multi-level nature of communication, while also addressing the scalability issues created by the huge number of cores on modern machines. We demonstrate node-aware and communication-aware strategies, including refinement strategies that aim for minimal migration and low load-balancing overhead.
- Graduation Semester
- 2025-05
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/129286
- Copyright and License Information
- Copyright 2025 Kavitha Chandrasekar
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois