Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies

Paul Vicol, Luke Metz, Jascha Sohl-Dickstein
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10553-10563, 2021.

Abstract

Unrolled computation graphs arise in many scenarios, including training RNNs, tuning hyperparameters through unrolled optimization, and training learned optimizers. Current approaches to optimizing parameters in such computation graphs suffer from high variance gradients, bias, slow updates, or large memory usage. We introduce a method called Persistent Evolution Strategies (PES), which divides the computation graph into a series of truncated unrolls, and performs an evolution strategies-based update step after each unroll. PES eliminates bias from these truncations by accumulating correction terms over the entire sequence of unrolls. PES allows for rapid parameter updates, has low memory usage, is unbiased, and has reasonable variance characteristics. We experimentally demonstrate the advantages of PES compared to several other methods for gradient estimation on synthetic tasks, and show its applicability to training learned optimizers and tuning hyperparameters.
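The abstract's description of PES can be illustrated with a minimal sketch. The toy dynamical system, the particle count, and all function names below are hypothetical illustrations, not code from the paper: each particle carries a persistent inner state and an accumulated perturbation term, an antithetic ES perturbation is applied over each truncated unroll, and the accumulated perturbations (rather than only the current ones) weight the losses, which is what removes the truncation bias.

```python
import numpy as np

# Hypothetical toy inner problem (not from the paper): the inner state follows
# s_{t+1} = tanh(theta * s_t + x_t), and each K-step unroll returns the summed
# squared error against a fixed target.
def unroll(s, theta, xs, target=0.5):
    loss = 0.0
    for x in xs:
        s = np.tanh(theta * s + x)
        loss += (s - target) ** 2
    return s, loss

def pes_step(theta, states, accum, sigma=0.1, K=4, rng=None):
    """One PES gradient estimate over a single truncated unroll of length K.

    states: per-particle inner states, persisted across unrolls
    accum:  per-particle accumulated perturbations (the PES correction terms)
    Returns (gradient estimate, updated states, updated accum).
    """
    rng = rng or np.random.default_rng(0)
    n = len(states) // 2
    xs = rng.normal(size=K)                # inputs shared by all particles
    eps = rng.normal(scale=sigma, size=n)
    eps = np.concatenate([eps, -eps])      # antithetic pairs, as in standard ES
    losses = np.empty_like(eps)
    for i in range(len(states)):
        # Each particle unrolls from its own persistent state with a
        # perturbed parameter theta + eps_i.
        states[i], losses[i] = unroll(states[i], theta + eps[i], xs)
    accum += eps                           # accumulate perturbations across unrolls
    grad = (accum * losses).mean() / sigma**2
    return grad, states, accum
```

A plain truncated-ES estimator would weight `losses` by `eps` alone; PES weights them by `accum`, the sum of all perturbations applied to each particle since the start of the sequence, so updates after every short unroll remain unbiased.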

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-vicol21a,
  title     = {Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies},
  author    = {Vicol, Paul and Metz, Luke and Sohl-Dickstein, Jascha},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {10553--10563},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/vicol21a/vicol21a.pdf},
  url       = {https://proceedings.mlr.press/v139/vicol21a.html},
  abstract  = {Unrolled computation graphs arise in many scenarios, including training RNNs, tuning hyperparameters through unrolled optimization, and training learned optimizers. Current approaches to optimizing parameters in such computation graphs suffer from high variance gradients, bias, slow updates, or large memory usage. We introduce a method called Persistent Evolution Strategies (PES), which divides the computation graph into a series of truncated unrolls, and performs an evolution strategies-based update step after each unroll. PES eliminates bias from these truncations by accumulating correction terms over the entire sequence of unrolls. PES allows for rapid parameter updates, has low memory usage, is unbiased, and has reasonable variance characteristics. We experimentally demonstrate the advantages of PES compared to several other methods for gradient estimation on synthetic tasks, and show its applicability to training learned optimizers and tuning hyperparameters.}
}
Endnote
%0 Conference Paper
%T Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies
%A Paul Vicol
%A Luke Metz
%A Jascha Sohl-Dickstein
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-vicol21a
%I PMLR
%P 10553--10563
%U https://proceedings.mlr.press/v139/vicol21a.html
%V 139
%X Unrolled computation graphs arise in many scenarios, including training RNNs, tuning hyperparameters through unrolled optimization, and training learned optimizers. Current approaches to optimizing parameters in such computation graphs suffer from high variance gradients, bias, slow updates, or large memory usage. We introduce a method called Persistent Evolution Strategies (PES), which divides the computation graph into a series of truncated unrolls, and performs an evolution strategies-based update step after each unroll. PES eliminates bias from these truncations by accumulating correction terms over the entire sequence of unrolls. PES allows for rapid parameter updates, has low memory usage, is unbiased, and has reasonable variance characteristics. We experimentally demonstrate the advantages of PES compared to several other methods for gradient estimation on synthetic tasks, and show its applicability to training learned optimizers and tuning hyperparameters.
APA
Vicol, P., Metz, L. & Sohl-Dickstein, J. (2021). Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:10553-10563. Available from https://proceedings.mlr.press/v139/vicol21a.html.