Authors
Sebastian Goldt, Marc Mézard, Florent Krzakala, Lenka Zdeborová
Publication date
2020/12/3
Journal
Physical Review X
Volume
10
Issue
4
Pages
041044
Description
Understanding the reasons for the success of deep neural networks trained using stochastic gradient-based methods is a key open problem for the nascent theory of deep learning. The types of data on which these networks are most successful, such as images or sequences of speech, are characterized by intricate correlations. Yet, most theoretical work on neural networks does not explicitly model training data or assumes that elements of each data sample are drawn independently from some factorized probability distribution. These approaches are thus, by construction, blind to the correlation structure of real-world datasets and its impact on learning in neural networks. Here, we introduce a generative model for structured datasets that we call the hidden manifold model. The idea is to construct high-dimensional inputs that lie on a lower-dimensional manifold, with labels that depend only on their position within …
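The abstract does not spell out the construction, but a minimal sketch of such a generator is straightforward: draw low-dimensional latent coordinates, embed them into a high-dimensional space through a fixed feature matrix and a nonlinearity, and assign labels from the latent coordinates alone. The Gaussian latents, random feature matrix, and the tanh/sign choices below are illustrative assumptions, and all names are hypothetical rather than taken from the paper.

import numpy as np

def hidden_manifold_dataset(n_samples, n_dim, latent_dim, rng=None):
    """Sketch of a hidden-manifold-style data generator.

    Inputs live on a latent_dim-dimensional manifold embedded in an
    n_dim-dimensional ambient space; labels depend only on the latent
    position, not on the ambient coordinates.
    """
    rng = np.random.default_rng(rng)
    # Latent coordinates on the hidden manifold (Gaussian: an assumption).
    C = rng.standard_normal((n_samples, latent_dim))
    # Fixed feature matrix embedding the manifold into the ambient space.
    F = rng.standard_normal((latent_dim, n_dim))
    # An elementwise nonlinearity folds the linear subspace into a curved manifold.
    X = np.tanh(C @ F / np.sqrt(latent_dim))
    # Labels depend only on the latent position, via a fixed teacher vector w.
    w = rng.standard_normal(latent_dim)
    y = np.sign(C @ w / np.sqrt(latent_dim))
    return X, y

X, y = hidden_manifold_dataset(n_samples=1000, n_dim=500, latent_dim=10, rng=0)
print(X.shape, y.shape)  # (1000, 500) (1000,)

Here the inputs are 500-dimensional, yet their variability is controlled by only 10 latent degrees of freedom, which is the structural property the model is designed to capture.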
Total citations
2019: 1 · 2020: 24 · 2021: 45 · 2022: 61 · 2023: 41 · 2024: 39