Black-Box Policy Search with Probabilistic Programs

Jan-Willem Vandemeent, Brooks Paige, David Tolpin, Frank Wood
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1195-1204, 2016.

Abstract

In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.
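
To make the idea of a policy as "a stochastic simulator with tunable parameters" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical one-step problem with a toy reward, a Gaussian policy with a single tunable mean `mu`, and a generic score-function (likelihood-ratio) gradient estimator of the kind used in black box variational inference and REINFORCE-style policy search. All names (`reward`, `policy`, `policy_search`) and all constants are illustrative assumptions.

```python
import random


def reward(action):
    """Hypothetical toy reward (an assumption), maximized at action = 3."""
    return -(action - 3.0) ** 2


def policy(mu, rng, sigma=1.0):
    """The policy as a stochastic simulator: draw an action given parameter mu."""
    return rng.gauss(mu, sigma)


def policy_search(steps=200, samples=100, lr=0.05, sigma=1.0, seed=0):
    """Gradient ascent on expected reward using a score-function estimator."""
    rng = random.Random(seed)
    mu = 0.0  # initial value of the tunable policy parameter
    for _ in range(steps):
        # Run the simulator: it is treated as a black box that can only be sampled.
        actions = [policy(mu, rng, sigma) for _ in range(samples)]
        rewards = [reward(a) for a in actions]
        # Batch-mean baseline to reduce the variance of the estimate.
        baseline = sum(rewards) / samples
        # Score function of the Gaussian: d/dmu log N(a; mu, sigma^2) = (a - mu) / sigma^2
        grad = sum(
            (r - baseline) * (a - mu) / sigma ** 2
            for a, r in zip(actions, rewards)
        ) / samples
        mu += lr * grad
    return mu


if __name__ == "__main__":
    print(policy_search())
```

Under these assumptions the learned `mu` should drift toward the reward maximum at 3. The estimator needs only samples from the policy and the gradient of its log-density, never the gradient of the reward, which is what makes the approach "black box".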

Cite this Paper


BibTeX
@InProceedings{pmlr-v51-vandemeent16,
  title = {Black-Box Policy Search with Probabilistic Programs},
  author = {Jan-Willem Vandemeent and Brooks Paige and David Tolpin and Frank Wood},
  pages = {1195--1204},
  year = {2016},
  editor = {Arthur Gretton and Christian C. Robert},
  volume = {51},
  series = {Proceedings of Machine Learning Research},
  address = {Cadiz, Spain},
  month = {09--11 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v51/vandemeent16.pdf},
  url = {http://proceedings.mlr.press/v51/vandemeent16.html},
  abstract = {In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.}
}
Endnote
%0 Conference Paper
%T Black-Box Policy Search with Probabilistic Programs
%A Jan-Willem Vandemeent
%A Brooks Paige
%A David Tolpin
%A Frank Wood
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-vandemeent16
%I PMLR
%J Proceedings of Machine Learning Research
%P 1195--1204
%U http://proceedings.mlr.press
%V 51
%W PMLR
%X In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.
RIS
TY  - CPAPER
TI  - Black-Box Policy Search with Probabilistic Programs
AU  - Jan-Willem Vandemeent
AU  - Brooks Paige
AU  - David Tolpin
AU  - Frank Wood
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
PY  - 2016/05/02
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-vandemeent16
PB  - PMLR
SP  - 1195
DP  - PMLR
EP  - 1204
L1  - http://proceedings.mlr.press/v51/vandemeent16.pdf
UR  - http://proceedings.mlr.press/v51/vandemeent16.html
AB  - In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.
ER  -
APA
Vandemeent, J., Paige, B., Tolpin, D. & Wood, F. (2016). Black-Box Policy Search with Probabilistic Programs. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in PMLR 51:1195-1204.
