Quick Training of Probabilistic Neural Nets by Importance Sampling

Yoshua Bengio, Jean-Sébastien Senecal
Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, PMLR R4:17-24, 2003.

Abstract

Our previous work on statistical language modeling introduced the use of probabilistic feedforward neural networks to help deal with the curse of dimensionality. Training this model by maximum likelihood, however, requires, for each example, as many network passes as there are words in the vocabulary. Inspired by the contrastive divergence model, we propose and evaluate sampling-based methods that require network passes only for the observed "positive example" word and a few sampled negative example words. A very significant speed-up is obtained with adaptive importance sampling.
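
The idea can be sketched in a few lines. Below is a minimal NumPy sketch (not the authors' code) of a self-normalized importance-sampling estimate of the expensive term of the softmax log-likelihood gradient, so that only the target word and a handful of sampled negative words are scored per example. The names score, score_grad, proposal, and the toy linear scorer are illustrative assumptions; the paper additionally adapts the proposal distribution during training, which is not shown here.

import numpy as np

def sampled_neg_gradient(ctx, score, score_grad, proposal, k=25, rng=None):
    # Draw k negative words from the proposal distribution q (e.g. unigram frequencies),
    # then form the self-normalized importance-sampling estimate of
    #   sum_w P(w | ctx) * d s(w, ctx) / d theta,
    # the term of the log-likelihood gradient that would otherwise require
    # one network pass per vocabulary word.
    if rng is None:
        rng = np.random.default_rng()
    vocab = np.arange(len(proposal))
    negatives = rng.choice(vocab, size=k, p=proposal)
    # Importance ratios exp(s(w, ctx)) / q(w), handled in log space for stability.
    log_ratio = np.array([score(v, ctx) - np.log(proposal[v]) for v in negatives])
    weights = np.exp(log_ratio - log_ratio.max())
    weights /= weights.sum()                      # self-normalization
    grads = np.stack([score_grad(v, ctx) for v in negatives])
    return (weights[:, None] * grads).sum(axis=0)

# Toy usage with a hypothetical linear "network": s(w, ctx) = theta[w] . ctx
rng = np.random.default_rng(0)
V, d = 1000, 16
theta = rng.normal(size=(V, d))
ctx = rng.normal(size=d)
proposal = np.full(V, 1.0 / V)                    # uniform proposal, for the demo only
score = lambda w, c: theta[w] @ c
score_grad = lambda w, c: c                       # gradient of s(w, ctx) w.r.t. theta[w]
neg_grad = sampled_neg_gradient(ctx, score, score_grad, proposal, k=50, rng=rng)
# The per-example update then uses score_grad(target, ctx) - neg_grad.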

Cite this Paper


BibTeX
@InProceedings{pmlr-vR4-bengio03a,
  title = {Quick Training of Probabilistic Neural Nets by Importance Sampling},
  author = {Bengio, Yoshua and Senecal, Jean-S{\'{e}}bastien},
  booktitle = {Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics},
  pages = {17--24},
  year = {2003},
  editor = {Bishop, Christopher M. and Frey, Brendan J.},
  volume = {R4},
  series = {Proceedings of Machine Learning Research},
  month = {03--06 Jan},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/r4/bengio03a/bengio03a.pdf},
  url = {https://proceedings.mlr.press/r4/bengio03a.html},
  abstract = {Our previous work on statistical language modeling introduced the use of probabilistic feedforward neural networks to help dealing with the curse of dimensionality. Training this model by maximum likelihood however requires for each example to perform as many network passes as there are words in the vocabulary. Inspired by the contrastive divergence model, we propose and evaluate sampling-based methods which require network passes only for the observed "positive example" and a few sampled negative example words. A very significant speed-up is obtained with an adaptive importance sampling.},
  note = {Reissued by PMLR on 01 April 2021.}
}
Endnote
%0 Conference Paper
%T Quick Training of Probabilistic Neural Nets by Importance Sampling
%A Yoshua Bengio
%A Jean-Sébastien Senecal
%B Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2003
%E Christopher M. Bishop
%E Brendan J. Frey
%F pmlr-vR4-bengio03a
%I PMLR
%P 17--24
%U https://proceedings.mlr.press/r4/bengio03a.html
%V R4
%X Our previous work on statistical language modeling introduced the use of probabilistic feedforward neural networks to help dealing with the curse of dimensionality. Training this model by maximum likelihood however requires for each example to perform as many network passes as there are words in the vocabulary. Inspired by the contrastive divergence model, we propose and evaluate sampling-based methods which require network passes only for the observed "positive example" and a few sampled negative example words. A very significant speed-up is obtained with an adaptive importance sampling.
%Z Reissued by PMLR on 01 April 2021.
APA
Bengio, Y. & Senecal, J. (2003). Quick Training of Probabilistic Neural Nets by Importance Sampling. Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R4:17-24. Available from https://proceedings.mlr.press/r4/bengio03a.html. Reissued by PMLR on 01 April 2021.