Abstract: Two forms of variation, spatial (process) variation and temporal variation (aging), are becoming severe limiters of the performance scaling provided by Moore's law in the sub-45nm regime.
Process variation affects processor pipelines by making some stages slower and others faster, thereby exacerbating pipeline imbalance. This reduces the frequency attainable by the pipeline. To improve performance, we propose ReCycle, an architectural framework that comprehensively applies cycle time stealing to the pipeline: it transfers the time slack of the faster stages to the slower ones by skewing clock arrival times at latching elements after fabrication. As a result, the pipeline can be clocked with a period close to the average stage delay rather than the longest one. In addition, ReCycle's frequency gains are enhanced with Donor stages, which are empty stages added to "donate" slack to the slow stages, and with Forward Body Biasing (FBB).
For a 17FO4 pipeline at 45nm, ReCycle combined with Donor stages and FBB improves performance by 9%, on average, reclaiming 90% of the performance losses due to variation.
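The claim that cycle time stealing lets the period approach the average stage delay can be illustrated with a small sketch. It is not ReCycle's actual algorithm: it assumes an idealized single pipeline loop, unbounded clock skew, and no hold-time or latch overheads, and the function names and delay values are hypothetical.

```python
def time_stealing_period(stage_delays):
    """Return (baseline_period, stolen_period, skews) for one pipeline loop.

    Assumption: stage i launches at skews[i] and must finish by
    period + skews[i + 1], with the last stage feeding back to stage 0.
    """
    n = len(stage_delays)
    # Without time stealing, the slowest stage sets the clock period.
    baseline_period = max(stage_delays)
    # Summing the per-stage constraints around the loop gives
    # sum(delays) <= n * period, so the ideal period is the mean delay.
    stolen_period = sum(stage_delays) / n
    # Clock arrival offset at each latch, relative to the first latch.
    skews = [0.0]
    for delay in stage_delays[:-1]:
        skews.append(skews[-1] + delay - stolen_period)
    return baseline_period, stolen_period, skews


def meets_timing(stage_delays, period, skews, tol=1e-9):
    """Check delay_i <= period + skew_(i+1) - skew_i for every stage."""
    n = len(stage_delays)
    return all(
        stage_delays[i] <= period + skews[(i + 1) % n] - skews[i] + tol
        for i in range(n)
    )


# Hypothetical post-fabrication stage delays, in arbitrary time units.
delays = [10.0, 14.0, 9.0, 11.0]
baseline, period, skews = time_stealing_period(delays)
print(baseline, period, skews, meets_timing(delays, period, skews))

# An idealized zero-delay donor stage lowers the average further.
donor_period = time_stealing_period(delays + [0.0])[1]
print(donor_period)
```

In practice the achievable skew range is limited, and an extra stage adds a cycle of latency, so gains from skewing and from Donor stages have to be weighed against such costs.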
In addition to spatial variation, processors progressively age during their useful lifetime due to normal workload activity. Such aging results in gradually slower circuits. Anticipating this fact, designers add timing guardbands to processors, so that they last for a number of years. As a result, aging has important design and cost implications.
To address this problem, we show how to hide the effects of aging and how to slow it down. Our framework is called Facelift. It hides aging through aging-driven application scheduling. It slows down aging by applying voltage changes at key times, using a non-linear optimization algorithm to carefully balance the impact on the aging rate and on the critical path delays. Moreover, it can gainfully configure the chip for a short lifetime. We can take a multicore with a 7-year lifetime and, by hiding and slowing down aging, enable it to cycle, on average, at a 14-15% higher frequency. Alternatively, we can design a multicore for a 5 to 7-month lifetime and use it for 7 years.