Cache Coherence in Embedded-Ring Multiprocessors
- Strauss, Karin
- Issue Date
- computer science
- Design complexity and limited power budgets are causing the number of cores on the same chip to grow very rapidly. The wide availability of Chip Multiprocessors (CMPs) is enabling the design of inexpensive, shared-memory machines of medium size (32-128 cores). However, for machines of this size, neither of the two traditional approaches to supporting cache coherence seems optimal. Snoopy schemes implemented with broadcast buses are difficult to scale efficiently beyond 8-32 cores. Directory-based schemes incur the cost of maintaining a directory structure, as well as the fundamental latency disadvantage of adding at least one level of indirection to coherence transactions. In this work, we propose to logically embed a ring in a point-to-point network topology. Snoop messages use the logical ring, while other messages can use any link in the network. The resulting design is simple and low cost. Perhaps the main drawback of the embedded-ring approach is that snoop requests may suffer long latencies or induce many snoop messages and operations. We address these issues and, as a result, provide simple and competitive cache coherence protocol designs.
- Type of Resource
- Copyright and License Information
- You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).
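The embedded-ring idea in the abstract can be illustrated with a minimal sketch. It is not the protocol from the report: the 2D mesh topology, the serpentine ring construction, and every function name below are assumptions chosen for illustration. The sketch embeds a logical ring in a mesh so that each ring hop uses one physical link, then counts how many hops a snoop request travels before it reaches a node holding the line, which is the latency concern the abstract mentions.

```python
def mesh_links(rows, cols):
    """Physical point-to-point links of a rows x cols mesh (assumed topology)."""
    links = set()
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                links.add(frozenset({(r, c), (r, c + 1)}))
            if r + 1 < rows:
                links.add(frozenset({(r, c), (r + 1, c)}))
    return links


def embed_ring(rows, cols):
    """Return the nodes of a mesh in logical ring order (hypothetical helper).

    Row 0 is walked left to right, rows 1..rows-1 are walked in a serpentine
    over columns 1..cols-1, and column 0 is the return path to the start.
    This closes into a cycle only when rows is even and cols >= 2.
    """
    if rows < 2 or rows % 2 or cols < 2:
        raise ValueError("this construction needs an even row count and cols >= 2")
    ring = [(0, c) for c in range(cols)]
    for r in range(1, rows):
        columns = range(cols - 1, 0, -1) if r % 2 else range(1, cols)
        ring.extend((r, c) for c in columns)
    ring.extend((r, 0) for r in range(rows - 1, 0, -1))
    return ring


def ring_uses_only_physical_links(ring, links):
    """Check that every logical ring hop maps onto one physical link."""
    n = len(ring)
    return all(frozenset({ring[i], ring[(i + 1) % n]}) in links for i in range(n))


def snoop(ring, requester, holders):
    """Forward a snoop request around the ring, one node at a time.

    Returns (supplier, hops). If no node holds the line, the request comes
    back to the requester after a full lap and the supplier is None.
    """
    n = len(ring)
    start = ring.index(requester)
    for hops in range(1, n):
        node = ring[(start + hops) % n]
        if node in holders:
            return node, hops
    return None, n


ring = embed_ring(4, 4)
links = mesh_links(4, 4)
print(ring_uses_only_physical_links(ring, links))
print(snoop(ring, (0, 0), {(1, 0)}))
```

In this sketch, node (1, 0) is a physical neighbor of the requester (0, 0) but is the last node on the logical ring, so the snoop request reaches it only after nearly a full lap. Messages that are not snoops would be free to take the direct link.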