Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models

Adepu Ravi Sankar; Vineeth N Balasubramanian

Asian Conference on Machine Learning, PMLR 45:391-406, 2016.

Abstract

Energy-based deep learning models such as Restricted Boltzmann Machines are increasingly used in real-world applications. However, all these models inherently depend on the Contrastive Divergence (CD) method for training, that is, for maximizing the log likelihood of the given data distribution. CD, which internally uses Gibbs sampling, often performs poorly because of biased samples, poor mixing of Markov chains, and high-mass probability modes. Variants of CD such as PCD, Fast PCD and Tempered MCMC have been proposed to address these issues. In this work, we propose a new approach to CD-based methods, called Diss-CD, which uses dissimilar data to allow the Markov chain to explore new modes in the probability space. The method can be used with all variants of CD (or PCD) and across all energy-based deep learning models. Our experiments with this approach on standard datasets, including MNIST, Caltech-101 Silhouette and Synthetic Transformations, demonstrate its promise, showing fast convergence of error in learning and a better approximation of the log likelihood of the data.

Cite this Paper

BibTeX


@InProceedings{pmlr-v45-Sankar15,
  title = 	 {Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models},
  author = 	 {Sankar, Adepu Ravi and Balasubramanian, Vineeth N},
  booktitle = 	 {Asian Conference on Machine Learning},
  pages = 	 {391--406},
  year = 	 {2016},
  editor = 	 {Holmes, Geoffrey and Liu, Tie-Yan},
  volume = 	 {45},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Hong Kong},
  month = 	 {20--22 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v45/Sankar15.pdf},
  url = 	 {https://proceedings.mlr.press/v45/Sankar15.html},
  abstract = 	 {Energy-based deep learning models like Restricted Boltzmann Machines are increasingly used for real-world applications. However, all these models inherently depend on the Contrastive Divergence (CD) method for training and maximization of log likelihood of generating the given data distribution. CD, which internally uses Gibbs sampling, often does not perform well due to issues such as biased samples, poor mixing of Markov chains and high-mass probability modes. Variants of CD such as PCD, Fast PCD and Tempered MCMC have been proposed to address this issue. In this work, we propose a new approach to CD-based methods, called Diss-CD, which uses dissimilar data to allow the Markov chain to explore new modes in the probability space. This method can be used with all variants of CD (or PCD), and across all energy-based deep learning models. Our experiments on using this approach on standard datasets including MNIST, Caltech-101 Silhouette and Synthetic Transformations, demonstrate the promise of this approach, showing fast convergence of error in learning and also a better approximation of log likelihood of the data. }
}

Endnote

%0 Conference Paper
%T Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models
%A Adepu Ravi Sankar
%A Vineeth N Balasubramanian
%B Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Geoffrey Holmes
%E Tie-Yan Liu
%F pmlr-v45-Sankar15
%I PMLR
%P 391--406
%U https://proceedings.mlr.press/v45/Sankar15.html
%V 45
%X Energy-based deep learning models like Restricted Boltzmann Machines are increasingly used for real-world applications. However, all these models inherently depend on the Contrastive Divergence (CD) method for training and maximization of log likelihood of generating the given data distribution. CD, which internally uses Gibbs sampling, often does not perform well due to issues such as biased samples, poor mixing of Markov chains and high-mass probability modes. Variants of CD such as PCD, Fast PCD and Tempered MCMC have been proposed to address this issue. In this work, we propose a new approach to CD-based methods, called Diss-CD, which uses dissimilar data to allow the Markov chain to explore new modes in the probability space. This method can be used with all variants of CD (or PCD), and across all energy-based deep learning models. Our experiments on using this approach on standard datasets including MNIST, Caltech-101 Silhouette and Synthetic Transformations, demonstrate the promise of this approach, showing fast convergence of error in learning and also a better approximation of log likelihood of the data.

RIS


TY  - CPAPER
TI  - Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models
AU  - Adepu Ravi Sankar
AU  - Vineeth N Balasubramanian
BT  - Asian Conference on Machine Learning
DA  - 2016/02/25
ED  - Geoffrey Holmes
ED  - Tie-Yan Liu
ID  - pmlr-v45-Sankar15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 45
SP  - 391
EP  - 406
L1  - http://proceedings.mlr.press/v45/Sankar15.pdf
UR  - https://proceedings.mlr.press/v45/Sankar15.html
AB  - Energy-based deep learning models like Restricted Boltzmann Machines are increasingly used for real-world applications. However, all these models inherently depend on the Contrastive Divergence (CD) method for training and maximization of log likelihood of generating the given data distribution. CD, which internally uses Gibbs sampling, often does not perform well due to issues such as biased samples, poor mixing of Markov chains and high-mass probability modes. Variants of CD such as PCD, Fast PCD and Tempered MCMC have been proposed to address this issue. In this work, we propose a new approach to CD-based methods, called Diss-CD, which uses dissimilar data to allow the Markov chain to explore new modes in the probability space. This method can be used with all variants of CD (or PCD), and across all energy-based deep learning models. Our experiments on using this approach on standard datasets including MNIST, Caltech-101 Silhouette and Synthetic Transformations, demonstrate the promise of this approach, showing fast convergence of error in learning and also a better approximation of log likelihood of the data. 
ER  -

APA


Sankar, A.R. & Balasubramanian, V.N. (2016). Similarity-based Contrastive Divergence Methods for Energy-based Deep Learning Models. Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 45:391-406. Available from https://proceedings.mlr.press/v45/Sankar15.html.
