Relationship between PreTraining and Maximum Likelihood Estimation in Deep Boltzmann Machines

Muneki Yasuda
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:582-590, 2016.

Abstract

This paper presents a pretraining algorithm, a layer-by-layer greedy learning algorithm, for the deep Boltzmann machine (DBM). By considering the deep-belief-net type of pretraining for the DBM, which is a simplified version of the original DBM pretraining, two theoretical facts about pretraining are obtained. (1) By applying two different types of approximation, a replacing approximation that uses a Bayesian network and a Bethe-type approximation based on the cluster variation method, to two different parts of the true log-likelihood function of the DBM, pretraining can be derived from a variational approximation of the original maximum likelihood estimation. (2) Pretraining is guaranteed to improve the variational bound on the true log-likelihood function of the DBM. These two theoretical results help deepen our understanding of deep learning. On the basis of these results, the original pretraining of the DBM is discussed in the latter part of the paper.
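For orientation on point (2), the following is a generic sketch of the standard variational lower bound on the DBM log-likelihood; it is not reproduced from the paper, and the notation (visible vector v, hidden layers h^(1) and h^(2), trial distribution q, parameters theta) is illustrative only:

\log p(\mathbf{v};\theta)
  \;\ge\; \sum_{\mathbf{h}^{(1)},\,\mathbf{h}^{(2)}}
     q(\mathbf{h}^{(1)},\mathbf{h}^{(2)}\mid\mathbf{v})\,
     \log p(\mathbf{v},\mathbf{h}^{(1)},\mathbf{h}^{(2)};\theta)
  \;+\; \mathcal{H}\big(q(\cdot\mid\mathbf{v})\big),

where \mathcal{H} is the entropy of q. The gap between the two sides is the Kullback-Leibler divergence from q to the true conditional p(\mathbf{h}^{(1)},\mathbf{h}^{(2)}\mid\mathbf{v};\theta), so the bound is tight when q matches that conditional; result (2) above states that pretraining improves the variational bound on the true log-likelihood.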

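As further context for the layer-by-layer greedy learning mentioned in the abstract, below is a minimal NumPy sketch of generic deep-belief-net style stacking: each restricted Boltzmann machine (RBM) is trained with one-step contrastive divergence (CD-1) on the activations of the layer below. This is a textbook illustration only, not the specific procedure analyzed in the paper; the function names (train_rbm, greedy_pretrain), the CD-1 training rule, and all hyperparameters are assumptions made for the example.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.05, rng=None):
    # Illustrative binary RBM trained with one-step contrastive divergence (CD-1).
    rng = np.random.default_rng(0) if rng is None else rng
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b = np.zeros(n_visible)   # visible biases
    c = np.zeros(n_hidden)    # hidden biases
    for _ in range(epochs):
        v0 = data
        ph0 = sigmoid(v0 @ W + c)                        # P(h = 1 | v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(h0 @ W.T + b)                      # one-step reconstruction
        v1 = (rng.random(pv1.shape) < pv1).astype(float)
        ph1 = sigmoid(v1 @ W + c)
        # CD-1 updates: positive phase minus negative phase
        W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(data)
        b += lr * (v0 - v1).mean(axis=0)
        c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

def greedy_pretrain(data, layer_sizes):
    # Layer-by-layer greedy stacking: each RBM is trained on the
    # mean hidden activations of the RBM below it.
    params, x = [], data
    for n_hidden in layer_sizes:
        W, b, c = train_rbm(x, n_hidden)
        params.append((W, b, c))
        x = sigmoid(x @ W + c)  # propagate activations to the next layer
    return params

# Example usage: pretrain two hidden layers on random binary "data".
if __name__ == "__main__":
    data = (np.random.default_rng(1).random((100, 20)) < 0.5).astype(float)
    params = greedy_pretrain(data, layer_sizes=[16, 8])
    print([W.shape for W, _, _ in params])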
Cite this Paper


BibTeX
@InProceedings{pmlr-v51-yasuda16,
  title     = {Relationship between PreTraining and Maximum Likelihood Estimation in Deep Boltzmann Machines},
  author    = {Yasuda, Muneki},
  booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages     = {582--590},
  year      = {2016},
  editor    = {Gretton, Arthur and Robert, Christian C.},
  volume    = {51},
  series    = {Proceedings of Machine Learning Research},
  address   = {Cadiz, Spain},
  month     = {09--11 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v51/yasuda16.pdf},
  url       = {https://proceedings.mlr.press/v51/yasuda16.html}
}
APA
Yasuda, M. (2016). Relationship between PreTraining and Maximum Likelihood Estimation in Deep Boltzmann Machines. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:582-590. Available from https://proceedings.mlr.press/v51/yasuda16.html.
