Meeting the Systems Challenge of Deep Learning

Seminar
Tuesday, February 25, 2020
3:30 PM to 4:30 PM
EER 3.646
Free and open to the public

Deep neural networks are pushing computer designs into new regimes of performance. They need much more than CPUs provide, and their demand has grown faster than the growth in chip capability that Moore’s Law delivers. Because of the compute they require, both training and inference have become a new and valuable market for the GPU. Yet while GPUs do the job better than CPUs, the GPU is not optimized for neural networks, and new, better-adapted architectures are now appearing.

While an AI-optimized architecture provides a performance boost, architecture alone is not enough to meet the demand; improvements in the underlying hardware are required as well. But even as demand for performance keeps growing and the time to train large networks is measured in days or weeks, the benefits that Moore’s Law has provided are slowing, with the end in sight.

Cerebras Systems, a venture-funded Silicon Valley systems startup, recently announced its success in developing a reliable, manufacturable wafer-scale chip and system aimed at training and inference in deep neural networks. The largest chip ever made, the Cerebras Wafer-Scale Engine is 60 times larger than the largest CPU and GPU chips. It contains 400,000 compute cores that provide petaflops of performance, 18 gigabytes of fast on-chip SRAM with more than ten petabytes per second of memory bandwidth, and a communication network with 50 petabits per second of bandwidth.

The talk will present the Cerebras system and discuss the technical problems of yield, packaging, cooling, and electrical power delivery that had to be solved to make it possible. We’ll also discuss programming, compilation, and the impact on deep learning of the new capabilities the Cerebras system provides.

Speaker

Rob Schreiber

Cerebras Systems

Rob Schreiber is a Distinguished Engineer at Cerebras Systems, Inc., where he works on architecture and programming of systems for accelerated training of deep neural networks. Before Cerebras he taught at Stanford and RPI and worked at NASA, at startups, and at HP. Schreiber’s research spans sequential and parallel algorithms for matrix computation, compiler optimization for parallel languages, and high-performance computer design. With Moler and Gilbert, he developed the sparse matrix extension of MATLAB. He created the NAS CG parallel benchmark. He was a designer of the High Performance Fortran language. Rob led the development at HP of a system for synthesis of custom hardware accelerators. He helped pioneer the exploitation of photonic signaling in processors and networks. He is an ACM Fellow and a SIAM Fellow, and in 2012 he received the Career Prize from the SIAM Activity Group on Supercomputing.