Conventional look-ahead architecture has been a powerful driver of the single-thread performance we see today. However, with frequency scaling plateaued, single-thread performance improvement has slowed significantly. Given the prevalence of multicore architectures in the marketplace and ever-increasing core counts, some believe single-thread performance is no longer relevant and that everybody *has* to start parallel programming. If only that were the solution! It is probably safe to say that we are entering an era in which cheap, effective performance tricks are becoming rare and we must increasingly diversify the sources of performance improvement. Nevertheless, there is no evidence that implicit parallelism has been exhausted. We believe that with more effort, we can still find significant performance enhancement opportunities, and exploiting implicit parallelism through look-ahead remains a valid principle. The conventional look-ahead implementation is the product of a lengthy evolution and is not the only way of carrying out look-ahead. We have been experimenting with a more decoupled design of the look-ahead logic. From what we have learned, it appears that practical implementations offering tangible performance benefits at reasonable energy costs are achievable. In this talk, I will discuss some specific ideas and point to some future directions that we are currently exploring.
Monday, March 19, 2012
Free and open to the public