Special Seminar: Albert Gu, "New Structured Primitives for Machine Learning"
From Erin Klapacz
Abstract: Machine learning models are composed of simple primitives such as matrix multiplication and sequence transformations. Studying and improving these primitives is critical to advancing the capabilities of ML models: for example, the advent of powerful representations such as convolutions and self-attention led to breakthroughs in deep learning. However, existing models still have many drawbacks, including computational inefficiency and difficulty modeling long context; as a result, end-to-end models remain ineffective across a wide variety of important domains. In this talk, I will introduce several new primitives for machine learning models to address these challenges. These include new representations for structured linear algebra that allow faster matrix computations, and the HiPPO framework for encoding long context in continuous signals. I will then present the Structured State Space sequence model (S4), a new primitive that builds upon these representations to set state-of-the-art results across domains such as images, audio, and time-series. Together, these methods provide effective new building blocks for machine learning models, especially towards addressing temporal data at scale, which remains an open challenge.
Bio: Albert Gu is a Ph.D. candidate in the Department of Computer Science at Stanford University, advised by Christopher Ré. His research broadly studies structured representations for advancing the capabilities of machine learning and deep learning models, with a focus on structured linear algebra, non-Euclidean representations, and the theory of sequence models. Previously, he completed a B.S. in Mathematics and Computer Science at Carnegie Mellon University, and an internship at DeepMind.