Authors
Radford M Neal
Publication date
1994/3/1
Description
In this chapter, I show that priors over network parameters can be defined in such a way that the corresponding priors over functions computed by the network reach reasonable limits as the number of hidden units goes to infinity. When using such priors, there is thus no need to limit the size of the network in order to avoid “overfitting”. The infinite network limit also provides insight into the properties of different priors. A Gaussian prior for hidden-to-output weights results in a Gaussian process prior for functions, which may be smooth, Brownian, or fractional Brownian. Quite different effects can be obtained using priors based on non-Gaussian stable distributions. In networks with more than one hidden layer, a combination of Gaussian and non-Gaussian priors appears most interesting.
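The convergence described above can be illustrated numerically. The following is a minimal sketch (not Neal's code) of drawing functions from the prior of a one-hidden-layer tanh network with Gaussian priors on all parameters, where the hidden-to-output weights are scaled by 1/sqrt(H); all parameter values and function names are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_network_function(x, n_hidden, sigma_w=5.0, sigma_b=1.0, sigma_v=1.0):
    """Sample one function f(x) from the prior of a tanh network with
    n_hidden hidden units and Gaussian weight/bias priors (illustrative values)."""
    # Input-to-hidden weights and biases
    w = rng.normal(0.0, sigma_w, size=n_hidden)
    b = rng.normal(0.0, sigma_b, size=n_hidden)
    # Hidden-to-output weights, scaled by 1/sqrt(H) so the prior over
    # functions approaches a Gaussian process as H grows
    v = rng.normal(0.0, sigma_v / np.sqrt(n_hidden), size=n_hidden)
    h = np.tanh(np.outer(x, w) + b)   # hidden unit activations at each input
    return h @ v                      # network output f(x)

x = np.linspace(-1.0, 1.0, 200)
for H in (10, 10_000):
    draws = np.array([sample_network_function(x, H) for _ in range(3)])
    # As H grows, these draws behave like samples from a Gaussian process prior.
    print(f"H={H}: sample std of f(0) over draws ≈ {draws[:, 100].std():.3f}")
```

Plotting the sampled functions for increasing H gives a visual sense of the Gaussian process limit; with a heavy-tailed (non-Gaussian stable) prior on the output weights in place of `rng.normal`, the draws would instead be dominated by a few large hidden units, as the chapter discusses.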
Total citations
Cited by, per year, 1995–2024 (per-year counts from the citation histogram)
Scholar articles
RM Neal - Bayesian learning for neural networks, 1996