Architectures for machine learning and machine learning for architecture

Nam, Hyoungwook

Permalink

https://hdl.handle.net/2142/130094

Description

Title

Architectures for machine learning and machine learning for architecture

Author(s)

Nam, Hyoungwook

Issue Date

2025-07-10

Director of Research (if dissertation) or Advisor (if thesis)

Torrellas, Josep

Doctoral Committee Chair(s)

Torrellas, Josep

Committee Member(s)

Li, Bo
Mendis, Charith
Bose, Pradip
Pothukuchi, Raghavendra
Kim, Nam Sung

Department of Study

Siebel School of Computing and Data Science

Discipline

Computer Science

Degree Granting Institution

University of Illinois Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Machine Learning
Computer Architecture

Language

eng

Abstract

The unprecedented success of machine learning (ML) has brought both a problem and an opportunity to computer architecture research. The problem is how to build efficient and scalable computer systems for ML computations. The opportunity is how ML methods can help solve research problems in computer architecture. This dissertation explores both directions. In the direction of computer architecture for ML, this work focuses on two challenges for large-scale ML: scalability and power efficiency. It presents two proposals to address them: MeshSlice and PowerGrad. In the other direction, it proposes FriendlyFoe, which uses adversarial ML for hardware security.
The first proposal is MeshSlice, a framework for efficient 2D tensor parallelism (TP) in distributed DNN training. MeshSlice consists of a novel 2D GeMM algorithm and an autotuner. The MeshSlice GeMM algorithm slices the collective communications into multiple partial collectives, which allows communication to overlap with computation. As a result, MeshSlice hides most of the communication latency. The MeshSlice LLM autotuner uses an analytical cost model to automatically find the optimal configuration of the 2D GeMM dataflow, the mesh shape, and the communication granularity. MeshSlice shows significant speedup in LLM training workloads compared to the state-of-the-art 2D TP method.
The second proposal is PowerGrad, a gradient-based hierarchical power management framework for power-limited ML inference environments. The main idea of PowerGrad is simple: identify, at runtime, how much the performance of each workload benefits from extra power, and hierarchically shift power in the datacenter from workloads that benefit the least to those that benefit the most. In practical terms, PowerGrad dynamically computes the derivative of each compute unit’s performance with respect to power (i.e., the performance gradient), and shifts power from lower-gradient units to higher-gradient ones. PowerGrad shows promising results in local CPU power control, automatically achieving high power efficiency using only hardware performance counters.
The final proposal is FriendlyFoe, which dynamically applies Adversarial Machine Learning (AML) to obfuscate side channels. FriendlyFoe defines a workflow to design obfuscation DNNs, called Defenders, with low overhead and low information leakage, and to customize them for different environments. Defenders are transferable, i.e., they thwart attacker classifiers that differ from those used to train the Defenders. They also resist adaptive attacks, where attackers train on the obfuscated signals collected while the Defender is active. Finally, the approach is general enough to apply to different environments. FriendlyFoe is demonstrated against two side channel attacks: one based on memory contention and one based on system power. FriendlyFoe is an efficient obfuscation method to defend against hardware side channels. Compared to current defenses, FriendlyFoe either 1) reduces the performance overhead at a similar level of security or 2) improves the security at a similar level of performance overhead.
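The gradient-based power shifting described for PowerGrad can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the finite-difference gradient estimate, the analytical performance models, and the fixed step size are all assumptions made for illustration. The abstract states that PowerGrad works at runtime from hardware performance counters and operates hierarchically, neither of which is modeled here.

```python
import math
from typing import Callable, Dict

PerfModel = Callable[[float], float]  # hypothetical: maps power (W) to performance


def estimate_gradient(perf: PerfModel, power: float, delta: float = 1.0) -> float:
    """Central finite-difference estimate of d(performance)/d(power)."""
    return (perf(power + delta) - perf(power - delta)) / (2.0 * delta)


def shift_power(
    budgets: Dict[str, float],
    perf_models: Dict[str, PerfModel],
    step: float = 5.0,
    min_power: float = 10.0,
) -> Dict[str, float]:
    """Move `step` watts from the lowest-gradient unit to the highest-gradient one.

    The total budget is preserved. Units that would drop below `min_power`
    are not allowed to donate power.
    """
    grads = {u: estimate_gradient(perf_models[u], p) for u, p in budgets.items()}
    donors = [u for u, p in budgets.items() if p - step >= min_power]
    new_budgets = dict(budgets)
    if not donors:
        return new_budgets
    donor = min(donors, key=lambda u: grads[u])
    receiver = max(budgets, key=lambda u: grads[u])
    if donor != receiver and grads[receiver] > grads[donor]:
        new_budgets[donor] -= step
        new_budgets[receiver] += step
    return new_budgets


# Assumed toy models: unit "a" still benefits noticeably from extra power,
# unit "b" benefits very little.
models: Dict[str, PerfModel] = {
    "a": lambda p: 10.0 * math.sqrt(p),
    "b": lambda p: 0.1 * p,
}
budgets = {"a": 100.0, "b": 100.0}
result = shift_power(budgets, models)
```

In this toy setup, unit "a" has the larger estimated gradient, so one step moves 5 W from "b" to "a" while the total stays at 200 W. Repeating the step would continue shifting power until the gradients equalize or the donor reaches its minimum power.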

Graduation Semester

2025-08

Type of Resource

Text

Handle URL

https://hdl.handle.net/2142/130094

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Computer Science

Dissertations and Theses from the Siebel School of Computer Science
