Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models

Adepu Ravi Sankar, Vineeth N Balasubramanian
Asian Conference on Machine Learning, PMLR 45:391-406, 2016.

Abstract

Energy-based deep learning models such as Restricted Boltzmann Machines are increasingly used in real-world applications. However, these models depend on the Contrastive Divergence (CD) method to train them by maximizing the log-likelihood of the data distribution. CD, which relies internally on Gibbs sampling, often performs poorly due to biased samples, poor mixing of the Markov chain, and modes of high probability mass in which the chain gets stuck. Variants of CD such as Persistent CD (PCD), Fast PCD, and Tempered MCMC have been proposed to address these issues. In this work, we propose a new approach to CD-based methods, called Diss-CD, which uses dissimilar data to let the Markov chain explore new modes in the probability space. The method can be combined with all variants of CD (or PCD), and across all energy-based deep learning models. Our experiments on standard datasets, including MNIST, Caltech-101 Silhouettes, and synthetic transformations, demonstrate the promise of the approach, showing fast convergence of the learning error and a better approximation of the log-likelihood of the data.
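
To make the mechanics concrete, below is a minimal sketch in Python/NumPy of CD-k training for a binary RBM, with an optional seed for the negative-phase Gibbs chain. Seeding that chain from dissimilar data illustrates the intuition behind Diss-CD (letting the chain start near other modes rather than at the training batch); the paper's actual selection and weighting of dissimilar samples may differ. All names here (RBM, cd_update, v_neg_init) are illustrative, not taken from the paper.

# Minimal sketch of CD-k for a binary RBM; not the authors' reference code.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible bias
        self.c = np.zeros(n_hidden)    # hidden bias

    def sample_h(self, v):
        # Hidden probabilities and binary samples given visibles.
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        # Visible probabilities and binary samples given hiddens.
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd_update(self, v_data, v_neg_init=None, k=1, lr=0.05):
        # One CD-k parameter update. If v_neg_init is given (e.g. a batch
        # of dissimilar data, as in the Diss-CD idea), the negative-phase
        # chain starts there instead of at the training batch.
        ph_data, _ = self.sample_h(v_data)          # positive phase

        v = v_data if v_neg_init is None else v_neg_init
        for _ in range(k):                           # negative phase: k Gibbs steps
            _, h_s = self.sample_h(v)
            pv, v = self.sample_v(h_s)
        ph_model, _ = self.sample_h(v)

        n = v_data.shape[0]
        # Gradient ascent on log-likelihood: data statistics minus model statistics.
        self.W += lr * (v_data.T @ ph_data - v.T @ ph_model) / n
        self.b += lr * (v_data - v).mean(axis=0)
        self.c += lr * (ph_data - ph_model).mean(axis=0)
        # Rough progress proxy; with a dissimilar seed this is not a true
        # reconstruction error of v_data.
        return np.mean((v_data - pv) ** 2)

# Toy usage: train on one "mode", seeding the negative chain with a
# contrasting batch so it can wander away from the data mode.
rbm = RBM(n_visible=16, n_hidden=8)
batch = (rng.random((32, 16)) < 0.2).astype(float)        # training data
dissimilar = (rng.random((32, 16)) > 0.2).astype(float)   # contrasting batch
for epoch in range(100):
    err = rbm.cd_update(batch, v_neg_init=dissimilar, k=1)

Setting v_neg_init=None recovers standard CD-k; keeping the chain state between calls instead would give a PCD-style persistent chain.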

Cite this Paper


BibTeX
@InProceedings{pmlr-v45-Sankar15,
  title     = {Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models},
  author    = {Sankar, Adepu Ravi and Balasubramanian, Vineeth N},
  booktitle = {Asian Conference on Machine Learning},
  pages     = {391--406},
  year      = {2016},
  editor    = {Holmes, Geoffrey and Liu, Tie-Yan},
  volume    = {45},
  series    = {Proceedings of Machine Learning Research},
  address   = {Hong Kong},
  month     = {20--22 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v45/Sankar15.pdf},
  url       = {https://proceedings.mlr.press/v45/Sankar15.html}
}
APA
Sankar, A.R. & Balasubramanian, V.N. (2016). Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models. Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 45:391-406. Available from https://proceedings.mlr.press/v45/Sankar15.html.
