Continuous Neural Networks

Nicolas Le Roux, Yoshua Bengio
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:404-411, 2007.

Abstract

This article extends neural networks to the case of an uncountable number of hidden units, in several ways. In the first approach proposed, a finite parametrization is possible, allowing gradient-based learning. While having the same number of parameters as an ordinary neural network, its internal structure suggests that it can represent some smooth functions much more compactly. Under mild assumptions, we also find better error bounds than with ordinary neural networks. Furthermore, this parametrization may help reduce the problem of neuron saturation. In a second approach, the input-to-hidden weights are fully nonparametric, yielding a kernel machine for which we demonstrate a simple kernel formula. Interestingly, the resulting kernel machine can be made hyperparameter-free and still generalizes despite the absence of explicit regularization.
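
As a rough illustration of the first approach, the sketch below (Python/NumPy) treats the hidden layer as a continuum indexed by u in [0, 1], so the output takes the form f(x) = integral over [0, 1] of a(u) tanh(w(u) . x + b(u)) du. Here a, w and b are given a piecewise-constant parametrization and the integral is approximated by a Riemann sum; this is an assumed toy construction chosen only to make the idea concrete, not the paper's exact parametrization, and the kernel-machine variant is not shown.

    # Minimal sketch (assumption, not the paper's construction): a continuum of
    # hidden units indexed by u in [0, 1], with a(u), w(u), b(u) piecewise-constant
    # so the model keeps a finite number of parameters. The integral over u is
    # approximated by a Riemann sum at n_quad midpoints.
    import numpy as np

    rng = np.random.default_rng(0)

    d = 3            # input dimension (illustrative)
    n_pieces = 5     # number of constant pieces parametrizing a(u), w(u), b(u)
    n_quad = 200     # quadrature points used to approximate the integral

    # Finite parametrization: one (a, w, b) triple per piece of [0, 1].
    a = rng.normal(size=n_pieces)        # output weights a(u)
    W = rng.normal(size=(n_pieces, d))   # input weights w(u)
    b = rng.normal(size=n_pieces)        # biases b(u)

    def f(x):
        # Approximate f(x) = int_0^1 a(u) tanh(w(u) . x + b(u)) du.
        u = (np.arange(n_quad) + 0.5) / n_quad                       # midpoints in (0, 1)
        idx = np.minimum((u * n_pieces).astype(int), n_pieces - 1)   # which piece each node falls in
        pre = W[idx] @ x + b[idx]                                    # pre-activations at the nodes
        return np.mean(a[idx] * np.tanh(pre))                        # mean over [0, 1] = Riemann sum

    x = rng.normal(size=d)
    print(f(x))

Because a, W and b remain finite-dimensional, the approximated integral is differentiable with respect to them, which is what makes gradient-based learning possible in this setting; the second, fully nonparametric approach and its kernel formula are not sketched here.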

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-leroux07a,
  title     = {Continuous Neural Networks},
  author    = {Roux, Nicolas Le and Bengio, Yoshua},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages     = {404--411},
  year      = {2007},
  editor    = {Meila, Marina and Shen, Xiaotong},
  volume    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {San Juan, Puerto Rico},
  month     = {21--24 Mar},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v2/leroux07a/leroux07a.pdf},
  url       = {https://proceedings.mlr.press/v2/leroux07a.html},
  abstract  = {This article extends neural networks to the case of an uncountable number of hidden units, in several ways. In the first approach proposed, a finite parametrization is possible, allowing gradient-based learning. While having the same number of parameters as an ordinary neural network, its internal structure suggests that it can represent some smooth functions much more compactly. Under mild assumptions, we also find better error bounds than with ordinary neural networks. Furthermore, this parametrization may help reducing the problem of saturation of the neurons. In a second approach, the input-to-hidden weights are fully nonparametric, yielding a kernel machine for which we demonstrate a simple kernel formula. Interestingly, the resulting kernel machine can be made hyperparameter-free and still generalizes in spite of an absence of explicit regularization.}
}
Endnote
%0 Conference Paper
%T Continuous Neural Networks
%A Nicolas Le Roux
%A Yoshua Bengio
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-leroux07a
%I PMLR
%P 404--411
%U https://proceedings.mlr.press/v2/leroux07a.html
%V 2
%X This article extends neural networks to the case of an uncountable number of hidden units, in several ways. In the first approach proposed, a finite parametrization is possible, allowing gradient-based learning. While having the same number of parameters as an ordinary neural network, its internal structure suggests that it can represent some smooth functions much more compactly. Under mild assumptions, we also find better error bounds than with ordinary neural networks. Furthermore, this parametrization may help reducing the problem of saturation of the neurons. In a second approach, the input-to-hidden weights are fully nonparametric, yielding a kernel machine for which we demonstrate a simple kernel formula. Interestingly, the resulting kernel machine can be made hyperparameter-free and still generalizes in spite of an absence of explicit regularization.
RIS
TY - CPAPER
TI - Continuous Neural Networks
AU - Nicolas Le Roux
AU - Yoshua Bengio
BT - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
DA - 2007/03/11
ED - Marina Meila
ED - Xiaotong Shen
ID - pmlr-v2-leroux07a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 2
SP - 404
EP - 411
L1 - http://proceedings.mlr.press/v2/leroux07a/leroux07a.pdf
UR - https://proceedings.mlr.press/v2/leroux07a.html
AB - This article extends neural networks to the case of an uncountable number of hidden units, in several ways. In the first approach proposed, a finite parametrization is possible, allowing gradient-based learning. While having the same number of parameters as an ordinary neural network, its internal structure suggests that it can represent some smooth functions much more compactly. Under mild assumptions, we also find better error bounds than with ordinary neural networks. Furthermore, this parametrization may help reducing the problem of saturation of the neurons. In a second approach, the input-to-hidden weights are fully nonparametric, yielding a kernel machine for which we demonstrate a simple kernel formula. Interestingly, the resulting kernel machine can be made hyperparameter-free and still generalizes in spite of an absence of explicit regularization.
ER -
APA
Roux, N.L. & Bengio, Y. (2007). Continuous Neural Networks. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 2:404-411. Available from https://proceedings.mlr.press/v2/leroux07a.html.
