Authors
Sanjeev Arora, Simon S Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang
Publication date
2019/4/26
Journal
Advances in Neural Information Processing Systems (NeurIPS 2019); also available as arXiv preprint arXiv:1904.11955
Description
How well does a classic deep net architecture like AlexNet or VGG19 classify on a standard dataset such as CIFAR-10 when its “width”—namely, number of channels in convolutional layers, and number of nodes in fully-connected internal layers—is allowed to increase to infinity? Such questions have come to the forefront in the quest to theoretically understand deep learning and its mysteries about optimization and generalization. They also connect deep learning to notions such as Gaussian processes and kernels. A recent paper [Jacot et al., 2018] introduced the Neural Tangent Kernel (NTK) which captures the behavior of fully-connected deep nets in the infinite width limit trained by gradient descent; this object was implicit in some other recent papers. An attraction of such ideas is that a pure kernel-based method is used to capture the power of a fully-trained deep net of infinite width.
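To make the NTK idea in the description above concrete, here is a minimal sketch of the finite-width "empirical" Neural Tangent Kernel for a small fully-connected ReLU net, written in JAX. This is an illustration only, not the paper's construction: the layer sizes and the helper names (init_params, forward, empirical_ntk) are assumptions made for this example.

```python
# Minimal sketch (illustrative, not the authors' exact method) of the empirical
# Neural Tangent Kernel K(x, x') = <df(x)/dtheta, df(x')/dtheta> for a small
# fully-connected ReLU net. In the infinite-width limit this kernel becomes
# deterministic and stays fixed during gradient-descent training (Jacot et al., 2018).
import jax
import jax.numpy as jnp

def init_params(key, sizes=(10, 512, 512, 1)):
    # Layer widths are arbitrary choices for this sketch.
    params = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        # NTK parameterization: N(0, 1) weights, scaled by 1/sqrt(d_in) in the forward pass.
        params.append((jax.random.normal(sub, (d_in, d_out)), jnp.zeros(d_out)))
    return params

def forward(params, x):
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W / jnp.sqrt(W.shape[0]) + b
        if i < len(params) - 1:
            h = jax.nn.relu(h)
    return h.squeeze(-1)  # one scalar output per example

def empirical_ntk(params, x1, x2):
    # Jacobian of the outputs with respect to all parameters, flattened per example.
    def flat_jac(x):
        jac = jax.jacobian(lambda p: forward(p, x))(params)
        leaves = [j.reshape(x.shape[0], -1) for j in jax.tree_util.tree_leaves(jac)]
        return jnp.concatenate(leaves, axis=1)
    return flat_jac(x1) @ flat_jac(x2).T  # (n1, n2) kernel matrix

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (4, 10))
print(empirical_ntk(params, x, x).shape)  # (4, 4)
```

As width grows, this random, finite-width kernel concentrates around its infinite-width limit, which is what allows a pure kernel-based method to stand in for a fully-trained, infinitely wide net.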
Total citations
2019: 46
2020: 158
2021: 187
2022: 221
2023: 208
2024: 122
Scholar articles
S Arora, SS Du, W Hu, Z Li, RR Salakhutdinov, R Wang - Advances in neural information processing systems, 2019