Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond

[edit]

Heng Guo, Kaan Kara, Ce Zhang ;
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:178-187, 2018.

Abstract

For Markov chain Monte Carlo methods, one of the greatest discrepancies between theory and system is the scan order — while most theoretical development on the mixing time analysis deals with random updates, real-world systems are implemented with systematic scans. We bridge this gap for models that exhibit a bipartite structure, including, most notably, the Restricted/Deep Boltzmann Machine. The de facto implementation for these models scans variables in a layer-wise fashion. We show that the Gibbs sampler with a layerwise alternating scan order has its relaxation time (in terms of epochs) no larger than that of a random-update Gibbs sampler (in terms of variable updates). We also construct examples to show that this bound is asymptotically tight. Through standard inequalities, our result also implies a comparison on the mixing times.

Related Material