Equilibrium approaches to deep learning: One (implicit) layer is all you need, Zico Kolter; IDS2 seminar series
From Oluwasanmi Koyejo on 11/6/2020
Abstract: Does deep learning actually need to be deep? In this talk, I will present some of our recent and ongoing work on Deep Equilibrium (DEQ) Models, an approach demonstrating that we can achieve most of the benefits of modern deep learning systems using very shallow models: ones defined implicitly via the fixed point of a nonlinear dynamical system. I will show that these methods can achieve results on par with the state of the art in domains spanning large-scale language modeling, image classification, and semantic segmentation, while requiring less memory and substantially simplifying architectures. I will also highlight some recent work analyzing the theoretical properties of these systems, where we show that certain classes of DEQ models are guaranteed to have a unique fixed point, easily controlled Lipschitz constants, and efficient algorithms for finding the equilibria. I will conclude by discussing ongoing work and future directions for these classes of models.
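To make the core idea concrete, here is a minimal sketch (my own illustration, not the speaker's code) of what "one implicit layer" means: instead of stacking many layers, a single layer f(z, x) is applied repeatedly until its output stops changing, and that fixed point z* = f(z*, x) serves as the model's hidden representation. The layer form tanh(Wz + Ux + b), the dimensions, and the spectral-norm scaling below are all illustrative assumptions; the scaling makes f a contraction in z, which is one simple way to guarantee a unique fixed point, in the spirit of the Lipschitz-control results mentioned in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_hidden = 4, 8            # illustrative sizes, not from the talk
x = rng.standard_normal(d_in)    # a single input vector

# One implicit layer: f(z, x) = tanh(W z + U x + b).
W = rng.standard_normal((d_hidden, d_hidden))
W *= 0.5 / np.linalg.norm(W, 2)  # shrink spectral norm to 0.5 so f is a contraction in z
U = rng.standard_normal((d_hidden, d_in))
b = rng.standard_normal(d_hidden)

def f(z, x):
    # tanh is 1-Lipschitz, so with ||W||_2 = 0.5 this map is 0.5-Lipschitz in z.
    return np.tanh(W @ z + U @ x + b)

# Naive fixed-point solver: iterate z <- f(z, x) until convergence.
# (Practical DEQ solvers use faster root-finding methods; plain iteration
# suffices here because f is a contraction.)
z = np.zeros(d_hidden)
for _ in range(100):
    z_next = f(z, x)
    if np.linalg.norm(z_next - z) < 1e-8:
        break
    z = z_next

z_star = z_next
# z_star approximately satisfies z* = f(z*, x): the equilibrium of the layer.
print(np.linalg.norm(z_star - f(z_star, x)) < 1e-7)
```

Because the hidden state is defined by an equilibrium condition rather than by storing every layer's activations, gradients can be obtained by differentiating through the fixed point itself, which is the source of the memory savings the abstract refers to.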