Making LLMs Useful Teachers

Seminar

Location: EER 3.646
Speaker: Abhishek Panigrahi, Princeton University

Abstract:
Training small language models requires effective distillation, yet existing methods treat 
teachers as static supervision sources. I argue that effective learning depends not only on what 
a model learns but also on when it learns it, and that these principles extend beyond traditional 
teacher-student setups.
First, I show that intermediate teacher checkpoints reveal implicit learning curricula, 
and that aligning students to these trajectories yields provable sample-complexity 
benefits. Building on this, I develop GRACES, which predicts teacher–student 
compatibility from gradients, and STAT, which adapts supervision to a student’s weak 
skills. I show how these ideas extend beyond distillation to progressive subnetwork 
training and context-enhanced learning, pointing toward a more general theory of efficient 
learning. Finally, I outline a vision for autonomous systems that can construct their own training 
curricula.


Bio:
Abhishek is a final-year graduate student in the Computer Science Department at Princeton 
University, advised by Prof. Sanjeev Arora. His research focuses on understanding and 
improving generalization in deep learning models, with an emphasis on principled training 
algorithms that offer theoretical or interpretable guarantees. He is an Apple AI/ML Scholar 
and a Siebel Scholar for 2025-26. Prior to his PhD, he was a resident at the Microsoft Research 
India Lab and studied computer science as an undergraduate at IIT Kharagpur.