Continuous Neural Networks

Nicolas Le Roux, Yoshua Bengio
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:404-411, 2007.

Abstract

This article extends neural networks to the case of an uncountable number of hidden units, in several ways. In the first approach proposed, a finite parametrization is possible, allowing gradient-based learning. While having the same number of parameters as an ordinary neural network, its internal structure suggests that it can represent some smooth functions much more compactly. Under mild assumptions, we also find better error bounds than with ordinary neural networks. Furthermore, this parametrization may help reduce the problem of saturation of the neurons. In a second approach, the input-to-hidden weights are fully nonparametric, yielding a kernel machine for which we demonstrate a simple kernel formula. Interestingly, the resulting kernel machine can be made hyperparameter-free and still generalizes despite the absence of explicit regularization.
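To make the idea of an uncountable number of hidden units concrete, the sketch below (not taken from the paper) treats the hidden layer as an integral over a continuum of hidden units indexed by u in [0, 1], and approximates that integral with a plain Riemann sum. The function names and the particular weight functions weight_fn and output_fn are illustrative assumptions; the paper's first contribution is precisely that, for suitable weight functions, this integral admits a finite, exact parametrization with gradient-based learning, so no grid approximation of this kind is needed.

# Hedged sketch, not the paper's parametrization: a "continuous" hidden layer
# f(x) = integral over u in [0,1] of a(u) * tanh(w(u) . x) du, where hidden
# units are indexed by a continuous variable u rather than a discrete index.
# The integral is approximated here by a simple Riemann sum on a fine grid.
import numpy as np

def continuous_hidden_layer(x, weight_fn, output_fn, n_grid=1000):
    """Approximate f(x) = int_0^1 a(u) * tanh(w(u) . x) du by a Riemann sum.

    x         : input vector, shape (d,)
    weight_fn : maps u in [0, 1] to an input-to-hidden weight vector, shape (d,)
    output_fn : maps u in [0, 1] to a scalar hidden-to-output weight a(u)
    """
    us = np.linspace(0.0, 1.0, n_grid)
    total = 0.0
    for u in us:
        total += output_fn(u) * np.tanh(np.dot(weight_fn(u), x))
    return total / n_grid  # du = 1 / n_grid

# Example with arbitrary (hypothetical) smooth weight functions.
d = 3
rng = np.random.default_rng(0)
W0, W1 = rng.normal(size=(d,)), rng.normal(size=(d,))
weight_fn = lambda u: (1 - u) * W0 + u * W1   # weights vary smoothly with u
output_fn = lambda u: np.sin(2 * np.pi * u)   # smooth output weights
x = rng.normal(size=(d,))
print(continuous_hidden_layer(x, weight_fn, output_fn))

The second approach mentioned in the abstract, in which the input-to-hidden weights are fully nonparametric, corresponds conceptually to integrating over all possible weight vectors rather than over a parametrized family, which is what yields a kernel machine; the kernel formula itself is derived in the paper and is not reproduced in this sketch.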

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-leroux07a,
  title     = {Continuous Neural Networks},
  author    = {Nicolas Le Roux and Yoshua Bengio},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages     = {404--411},
  year      = {2007},
  editor    = {Marina Meila and Xiaotong Shen},
  volume    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {San Juan, Puerto Rico},
  month     = {21--24 Mar},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v2/leroux07a/leroux07a.pdf},
  url       = {http://proceedings.mlr.press/v2/leroux07a.html}
}
APA
Le Roux, N. & Bengio, Y. (2007). Continuous Neural Networks. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in PMLR 2:404-411.