Deep neural networks are pushing computer designs into new regimes of performance. Their compute demands far exceed what CPUs can provide, and those demands have grown faster than the chip capability delivered by Moore’s Law. As a result, both training and inference have become a new and valuable market for the GPU. Yet while GPUs do the job better than CPUs, they were not designed for neural networks, and new, better-adapted architectures are now appearing.
While an AI-optimized architecture provides a performance boost, architecture alone is not enough to meet the demand; hardware improvements are required as well. But even as demand for performance continues to grow and the time to train large networks is measured in days or weeks, the benefits of Moore’s Law are slowing, with its end in sight.
Cerebras Systems, a venture-funded, Silicon Valley systems startup, has recently announced its success in developing a reliable, manufacturable wafer-scale chip and system aimed at training and inference in deep neural networks. The largest chip ever made, the Cerebras Wafer-Scale Engine is 60 times larger than the largest CPU and GPU chips. It holds 400,000 compute cores that deliver petaflops of performance, 18 gigabytes of fast SRAM memory with over ten petabytes per second of bandwidth, and a communication network with 50 petabits per second of bandwidth.
The talk will present the Cerebras system and discuss the technical problems of yield, packaging, cooling, and electrical power delivery that had to be solved to make it possible. We’ll also talk about programming, compilation, and the impact on deep learning of the new capabilities the Cerebras system provides.