Parallel tempering on optimized paths

Saifuddin Syed, Vittorio Romaniello, Trevor Campbell, Alexandre Bouchard-Cote
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10033-10042, 2021.

Abstract

Parallel tempering (PT) is a class of Markov chain Monte Carlo algorithms that constructs a path of distributions annealing between a tractable reference and an intractable target, and then interchanges states along the path to improve mixing in the target. The performance of PT depends on how quickly a sample from the reference distribution makes its way to the target, which in turn depends on the particular path of annealing distributions. However, past work on PT has used only simple paths constructed from convex combinations of the reference and target log-densities. This paper begins by demonstrating that this path performs poorly in the setting where the reference and target are nearly mutually singular. To address this issue, we expand the framework of PT to general families of paths, formulate the choice of path as an optimization problem that admits tractable gradient estimates, and propose a flexible new family of spline interpolation paths for use in practice. Theoretical and empirical results both demonstrate that our proposed methodology breaks previously-established upper performance limits for traditional paths.
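The "convex combinations of the reference and target log-densities" path that the abstract refers to can be sketched in a few lines. Below is a minimal, self-contained illustration of PT on that linear path, using a hypothetical bimodal target and a wide normal reference (both invented for this sketch); it shows the two ingredients the abstract describes, local exploration at each annealing distribution and state swaps along the path, not the paper's optimized spline paths.

```python
import math
import random

def log_ref(x):
    # hypothetical tractable reference: N(0, 3^2) log-density, up to a constant
    return -0.5 * (x / 3.0) ** 2

def log_target(x):
    # hypothetical intractable target: equal mixture of N(-3, 0.5^2) and N(3, 0.5^2)
    a = math.exp(-0.5 * ((x + 3.0) / 0.5) ** 2)
    b = math.exp(-0.5 * ((x - 3.0) / 0.5) ** 2)
    return math.log(a + b + 1e-300)

def log_annealed(x, beta):
    # linear path: convex combination of reference and target log-densities
    return (1.0 - beta) * log_ref(x) + beta * log_target(x)

def parallel_tempering(betas, n_iter, step=1.0, seed=0):
    rng = random.Random(seed)
    xs = [0.0] * len(betas)   # one state per annealing distribution
    samples = []
    for it in range(n_iter):
        # local exploration: one random-walk Metropolis step per chain
        for i, beta in enumerate(betas):
            prop = xs[i] + rng.gauss(0.0, step)
            if math.log(rng.random()) < log_annealed(prop, beta) - log_annealed(xs[i], beta):
                xs[i] = prop
        # communication: propose swaps between adjacent chains (alternating pairs)
        for i in range(it % 2, len(betas) - 1, 2):
            b0, b1 = betas[i], betas[i + 1]
            log_ratio = (log_annealed(xs[i + 1], b0) + log_annealed(xs[i], b1)
                         - log_annealed(xs[i], b0) - log_annealed(xs[i + 1], b1))
            if math.log(rng.random()) < log_ratio:
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
        samples.append(xs[-1])  # the beta = 1 chain targets the distribution of interest
    return samples

betas = [i / 7 for i in range(8)]  # evenly spaced schedule on the linear path
samples = parallel_tempering(betas, n_iter=5000)
```

The swaps let states from the easily-explored reference migrate along the path to the target chain, which is why the target chain visits both well-separated modes; a plain Metropolis chain on the target alone would tend to get stuck in one of them.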

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-syed21a,
  title     = {Parallel tempering on optimized paths},
  author    = {Syed, Saifuddin and Romaniello, Vittorio and Campbell, Trevor and Bouchard-Cote, Alexandre},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {10033--10042},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/syed21a/syed21a.pdf},
  url       = {https://proceedings.mlr.press/v139/syed21a.html},
  abstract  = {Parallel tempering (PT) is a class of Markov chain Monte Carlo algorithms that constructs a path of distributions annealing between a tractable reference and an intractable target, and then interchanges states along the path to improve mixing in the target. The performance of PT depends on how quickly a sample from the reference distribution makes its way to the target, which in turn depends on the particular path of annealing distributions. However, past work on PT has used only simple paths constructed from convex combinations of the reference and target log-densities. This paper begins by demonstrating that this path performs poorly in the setting where the reference and target are nearly mutually singular. To address this issue, we expand the framework of PT to general families of paths, formulate the choice of path as an optimization problem that admits tractable gradient estimates, and propose a flexible new family of spline interpolation paths for use in practice. Theoretical and empirical results both demonstrate that our proposed methodology breaks previously-established upper performance limits for traditional paths.}
}
Endnote
%0 Conference Paper
%T Parallel tempering on optimized paths
%A Saifuddin Syed
%A Vittorio Romaniello
%A Trevor Campbell
%A Alexandre Bouchard-Cote
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-syed21a
%I PMLR
%P 10033--10042
%U https://proceedings.mlr.press/v139/syed21a.html
%V 139
%X Parallel tempering (PT) is a class of Markov chain Monte Carlo algorithms that constructs a path of distributions annealing between a tractable reference and an intractable target, and then interchanges states along the path to improve mixing in the target. The performance of PT depends on how quickly a sample from the reference distribution makes its way to the target, which in turn depends on the particular path of annealing distributions. However, past work on PT has used only simple paths constructed from convex combinations of the reference and target log-densities. This paper begins by demonstrating that this path performs poorly in the setting where the reference and target are nearly mutually singular. To address this issue, we expand the framework of PT to general families of paths, formulate the choice of path as an optimization problem that admits tractable gradient estimates, and propose a flexible new family of spline interpolation paths for use in practice. Theoretical and empirical results both demonstrate that our proposed methodology breaks previously-established upper performance limits for traditional paths.
APA
Syed, S., Romaniello, V., Campbell, T. & Bouchard-Cote, A. (2021). Parallel tempering on optimized paths. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:10033-10042. Available from https://proceedings.mlr.press/v139/syed21a.html.