
Resource-efficient AI System Design

Computer Architecture Seminar

Location: EER 3.646
Speaker:
Ana Klimovic
ETH Zurich
Abstract: Today’s large-scale AI model training and serving jobs require many hardware accelerators to run, making these jobs extremely costly and power-hungry. Yet despite requiring many GPUs, AI jobs often underutilize individual GPUs for a variety of reasons, including data preprocessing stalls, communication stalls, limited batching opportunities, and imbalanced memory and compute usage of individual operators within a job. This inefficient use of hardware accelerators further increases costs. In this talk, we will discuss why optimizing hardware accelerator (e.g., GPU) utilization is key to improving the cost and energy efficiency of AI workloads and how we can achieve this. To avoid communication stalls, I will present SAILOR, an elastic AI training framework that dynamically co-optimizes the cluster topology and job parallelization plan as resource availability varies over time. To improve batching opportunities for multi-variant model serving, I will present DeltaZip, a serving platform that leverages the key insight that model deltas can be aggressively compressed to improve serving and swapping latency while maintaining high accuracy. Finally, to address the imbalanced memory and compute utilization of GPUs within individual jobs, I will present Orion, an interference-aware GPU scheduler that schedules tasks from collocated workloads at the fine granularity of individual GPU kernels to maximize GPU utilization while maintaining high performance.
 
Bio: Ana Klimovic is an Assistant Professor in the Systems Group of the Computer Science Department at ETH Zurich. Her research interests span operating systems, computer architecture, and their intersection with machine learning. Ana's work focuses on computer system design for large-scale applications such as cloud computing services, data analytics, and machine learning. Before joining ETH in August 2020, Ana was a Research Scientist at Google Brain and completed her Ph.D. in Electrical Engineering at Stanford University.
Seminar Series