Description

Around 10 years ago, a paper came out that arguably killed classical deep learning theory: Zhang et al.'s aptly titled Understanding deep learning requires rethinking generalization.

Of course, this is a bit of an exaggeration. No single paper ever kills a field of research on its own, and deep learning theory was not exactly a productive or healthy field when this paper came out. But if I had to point to a single paper that shattered the feeling of optimism at the time, it would be Zhang et al. 2016.[1]

Caption: Believe it or not, this unassuming table rocked the field of deep learning theory back in 2016, despite probably involving fewer computational resources than Claude 4.7 Opus consumed when I clicked the “Claude” button embedded in the LessWrong editor.



Let's start by answering a question: what, exactly, do I mean by deep learning theory?

At least in 2016, the answer was: “extending statistical learning theory to deep neural networks trained with SGD, in order to derive generalization bounds that would explain their behavior in practice”.
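
To give a flavor of what such a bound looks like, here is the schematic, textbook-style form (an illustration of the genre, not a formula quoted from the post): with high probability over the draw of $n$ training samples, every hypothesis $f$ in the class $\mathcal{H}$ satisfies roughly

$$
\underbrace{\mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(f(x),y)\big]}_{\text{test error}}
\;\le\;
\underbrace{\frac{1}{n}\sum_{i=1}^{n}\ell\big(f(x_i),y_i\big)}_{\text{training error}}
\;+\;
O\!\left(\sqrt{\frac{\mathcal{C}(\mathcal{H})}{n}}\right),
$$

where $\mathcal{C}(\mathcal{H})$ is some capacity measure of the hypothesis class (VC dimension, Rademacher complexity, a norm-based quantity, and so on). The hope was that, for deep networks trained with SGD, one could find a capacity measure small enough to make the right-hand side informative about the test error observed in practice.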



Since its conception in the mid-1980s, statistical learning theory had been the dominant approach for [...]

The original text contained 2 footnotes which were omitted from this narration.

---


First published:

April 25th, 2026



Source:

https://www.lesswrong.com/posts/ZvQfcLbcNHYqmvWyo/the-paper-that-killed-deep-learning-theory


---




Narrated by TYPE III AUDIO.


---

Images from the article:

Table comparing training and test accuracy of models on the CIFAR10 dataset with various configurations.
Stick figure comic about difficulty detecting birds in photos versus national park locations.
Text excerpt discussing randomization tests and deep neural networks fitting random labels.