Authors
Dazhi Zhao, Guozhu Yu, Peng Xu, Maokang Luo
Publication date
2019
Journal
Neural Networks
Volume
115
Pages
82-89
Publisher
Elsevier
Description
The great achievements of deep learning can be attributed to its tremendous power of feature representation, where the representation ability comes from the nonlinear activation function and the large number of network nodes. However, deep neural networks suffer from serious issues such as slow convergence; dropout is an outstanding method for improving a network's generalization ability and test performance. Many explanations have been given for why dropout works so well, among which the equivalence between dropout and data augmentation is a newly proposed and stimulating explanation. In this article, we discuss the exact conditions for this equivalence to hold. Our main result guarantees that the equivalence relation almost surely holds if the dimension of the input space is equal to or higher than that of the output space. Furthermore, if the commonly used rectified linear unit activation function is …
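The equivalence idea can be illustrated with a minimal sketch: for one dropout realisation on a hidden layer, one can look for an "augmented" input whose plain forward pass reproduces the dropped hidden activation, which is generically solvable when the input dimension is at least the hidden dimension. The network, weights, and dimensions below are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

# Illustrative one-hidden-layer ReLU network with random placeholder weights.
# Input dimension >= hidden dimension, matching the dimension condition above.
d_in, d_hidden, d_out = 6, 4, 3
W1 = rng.normal(size=(d_hidden, d_in))
W2 = rng.normal(size=(d_out, d_hidden))

def forward(x, mask=None):
    h = relu(W1 @ x)
    if mask is not None:          # dropout applied to the hidden layer
        h = mask * h
    return W2 @ h

x = rng.normal(size=d_in)
mask = (rng.random(d_hidden) < 0.5).astype(float)   # one dropout realisation

y_dropout = forward(x, mask)

# Find an augmented input x_aug whose ordinary forward pass reproduces the
# dropped hidden activation: solve W1 @ x_aug = mask * relu(W1 @ x).
# Since d_in >= d_hidden and W1 generically has full row rank, an exact
# solution exists; lstsq returns the minimum-norm one.
target = mask * relu(W1 @ x)
x_aug, *_ = np.linalg.lstsq(W1, target, rcond=None)

y_augmented = forward(x_aug)      # plain forward pass, no dropout

print(np.allclose(y_dropout, y_augmented))   # True: same network output
```

In this toy setting the dropout pass on x and the ordinary pass on the transformed sample x_aug give the same output, which is the sense in which a dropout realisation corresponds to an augmented training example.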
Total citations
(Citation histogram by year, 2019–2024)