A tiered network buffer architecture for fast networking in chiplet-based CPUs
Wang, Jerry
Permalink
https://hdl.handle.net/2142/129792
Description
Issue Date
2025-05-09
Director of Research (if dissertation) or Advisor (if thesis)
Kim, Nam Sung
Department of Study
Electrical & Computer Engineering
Discipline
Electrical & Computer Engineering
Degree Granting Institution
University of Illinois Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Computer Architecture
Computer Network
Language
eng
Abstract
To manufacture a large CPU cost-effectively, the industry has begun exploiting emerging packaging technologies that integrate multiple chiplets—each comprising a subset of cores and/or memory and I/O subsystems—into a single package. However, such a CPU experiences longer memory access latency with more pronounced variance, especially when cores in one chiplet access LLC slices or DRAM controllers in other chiplets. This creates unique challenges for µs-scale networking, which is highly sensitive to memory access latency. In this work, we first propose exploiting a little-known mode, known as Sub-NUMA Clustering (SNC), in the latest chiplet-based CPUs. As it restricts receiving and processing packets to a particular chiplet unless explicitly specified otherwise, it offers shorter memory access latency and, consequently, lower networking latency than the default mode (non-SNC). Nonetheless, when receiving long bursts of packets, SNC incurs higher networking latency than non-SNC, as it provides less LLC capacity for the CPU cores processing the packets, making Direct Cache Access (DCA)—a commonly used CPU feature that reduces memory access latency for packet processing—ineffective. To address this drawback, we propose TiNA, a tiered network buffer architecture consisting of an enhanced NIC and networking stack, which opportunistically uses LLC slices in other chiplets for DCA only when receiving long bursts of packets. On average, TiNA reduces the mean (tail) latency by 25% (18%) and 28% (22%), compared to SNC and non-SNC, respectively, across diverse network applications and traces.
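The tiered placement idea in the abstract can be illustrated with a small policy sketch. This is not code from the thesis: the class name, tier names, and thresholds below are hypothetical, and it only models the steering decision—packets go to the local chiplet's LLC tier by default, spill to remote chiplets' LLC slices only during a long burst, and otherwise fall back to DRAM.

```python
# Illustrative sketch (not from the thesis): a tiered DCA steering policy in
# the spirit of TiNA. All names and parameters here are hypothetical.

class TieredBufferSteering:
    """Decides where an incoming packet buffer is DCA-placed: the local
    chiplet's LLC slices by default, remote chiplets' slices only when a
    long burst has filled the local tier, and DRAM otherwise."""

    def __init__(self, local_slots: int, burst_threshold: int):
        self.local_slots = local_slots          # DCA capacity of the local LLC tier
        self.burst_threshold = burst_threshold  # packets before a run counts as a long burst
        self.in_flight_local = 0                # buffers currently occupying the local tier
        self.burst_len = 0                      # length of the current packet burst

    def place(self) -> str:
        """Return the placement tier for the next received packet."""
        self.burst_len += 1
        # Default: keep packets in the local chiplet for short memory latency.
        if self.in_flight_local < self.local_slots:
            self.in_flight_local += 1
            return "local"
        # Local tier is full and the burst is long: opportunistically
        # borrow LLC slices in other chiplets.
        if self.burst_len > self.burst_threshold:
            return "remote"
        # Short burst with a full local tier: fall back to DRAM.
        return "dram"

    def complete_local(self) -> None:
        """A packet placed in the local tier has been processed."""
        self.in_flight_local = max(0, self.in_flight_local - 1)

    def end_of_burst(self) -> None:
        """The NIC observed a gap in the packet stream."""
        self.burst_len = 0
```

Under this (assumed) policy, remote LLC capacity is touched only when both conditions hold—local tier exhausted and burst long—which mirrors the abstract's claim that TiNA uses other chiplets' slices "only when receiving long bursts of packets."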