Black-Box Policy Search with Probabilistic Programs

Jan-Willem Vandemeent, Brooks Paige, David Tolpin, Frank Wood
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1195-1204, 2016.

Abstract

In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v51-vandemeent16,
  title = {Black-Box Policy Search with Probabilistic Programs},
  author = {Vandemeent, Jan-Willem and Paige, Brooks and Tolpin, David and Wood, Frank},
  booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = {1195--1204},
  year = {2016},
  editor = {Gretton, Arthur and Robert, Christian C.},
  volume = {51},
  series = {Proceedings of Machine Learning Research},
  address = {Cadiz, Spain},
  month = {09--11 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v51/vandemeent16.pdf},
  url = {http://proceedings.mlr.press/v51/vandemeent16.html},
  abstract = {In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.}
}
Endnote
%0 Conference Paper
%T Black-Box Policy Search with Probabilistic Programs
%A Jan-Willem Vandemeent
%A Brooks Paige
%A David Tolpin
%A Frank Wood
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-vandemeent16
%I PMLR
%P 1195--1204
%U http://proceedings.mlr.press/v51/vandemeent16.html
%V 51
%X In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.
RIS
TY  - CPAPER
TI  - Black-Box Policy Search with Probabilistic Programs
AU  - Jan-Willem Vandemeent
AU  - Brooks Paige
AU  - David Tolpin
AU  - Frank Wood
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-vandemeent16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1195
EP  - 1204
L1  - http://proceedings.mlr.press/v51/vandemeent16.pdf
UR  - http://proceedings.mlr.press/v51/vandemeent16.html
AB  - In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.
ER  -
APA
Vandemeent, J., Paige, B., Tolpin, D. & Wood, F. (2016). Black-Box Policy Search with Probabilistic Programs. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1195-1204. Available from http://proceedings.mlr.press/v51/vandemeent16.html.
