A tiered network buffer architecture for fast networking in chiplet-based CPUs
Wang, Jerry
Permalink
https://hdl.handle.net/2142/129792
Description
- Title
- A tiered network buffer architecture for fast networking in chiplet-based CPUs
- Author(s)
- Wang, Jerry
- Issue Date
- 2025-05-09
- Director of Research (if dissertation) or Advisor (if thesis)
- Kim, Nam Sung
- Department of Study
- Electrical & Computer Engineering
- Discipline
- Electrical & Computer Engineering
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Computer Architecture
- Computer Network
- Abstract
- To manufacture a large CPU cost-effectively, the industry has begun exploiting emerging packaging technologies that integrate multiple chiplets—each comprising a subset of cores and/or memory and I/O subsystems—into a single package. However, such a CPU experiences longer memory access latency with more pronounced variance, especially when its cores in one chiplet access LLC slices or DRAM controllers in other chiplets. This creates unique challenges in µs-scale networking, which is highly sensitive to memory access latency. In this work, we start by proposing exploiting a little-known mode, known as Sub-NUMA Clustering (SNC), in the latest chiplet-based CPUs. As it restricts receiving and processing packets to a particular chiplet unless explicitly specified otherwise, it offers shorter memory access latency and, consequently, lower networking latency than the default mode (non-SNC). Nonetheless, when receiving long bursts of packets, SNC incurs higher networking latency than non-SNC, as it provides less LLC capacity for CPU cores processing the packets, making Direct Cache Access (DCA)—a commonly used CPU feature to reduce memory access latency for packet processing—ineffective. To address this drawback, we propose TiNA, a tiered network buffer architecture consisting of an enhanced NIC and networking stack, which opportunistically uses LLC slices in other chiplets for DCA only when receiving long bursts of packets. On average, TiNA reduces the mean (tail) latency by 25% (18%) and 28% (22%), compared to SNC and non-SNC, respectively, across diverse network applications and traces.
- Graduation Semester
- 2025-05
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/129792
- Copyright and License Information
- Copyright 2025 Tianchen-Jerry Wang
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)