Black-Box Policy Search with Probabilistic Programs

Jan-Willem Vandemeent; Brooks Paige; David Tolpin; Frank Wood

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1195-1204, 2016.

Abstract

In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters. To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches. We then explain how such learning can be implemented in a probabilistic programming system. Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.

Cite this Paper

BibTeX

@InProceedings{pmlr-v51-vandemeent16,
  title = 	 {Black-Box Policy Search with Probabilistic Programs},
  author = 	 {Vandemeent, Jan-Willem and Paige, Brooks and Tolpin, David and Wood, Frank},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1195--1204},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/vandemeent16.pdf},
  url = 	 {https://proceedings.mlr.press/v51/vandemeent16.html},
  abstract = 	 {In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters.  To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches.  We then explain how such learning can be implemented in a probabilistic programming system.  Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.}
}

Endnote

%0 Conference Paper
%T Black-Box Policy Search with Probabilistic Programs
%A Jan-Willem Vandemeent
%A Brooks Paige
%A David Tolpin
%A Frank Wood
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-vandemeent16
%I PMLR
%P 1195--1204
%U https://proceedings.mlr.press/v51/vandemeent16.html
%V 51
%X In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters.  To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches.  We then explain how such learning can be implemented in a probabilistic programming system.  Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.

RIS

TY  - CPAPER
TI  - Black-Box Policy Search with Probabilistic Programs
AU  - Jan-Willem Vandemeent
AU  - Brooks Paige
AU  - David Tolpin
AU  - Frank Wood
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-vandemeent16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1195
EP  - 1204
L1  - http://proceedings.mlr.press/v51/vandemeent16.pdf
UR  - https://proceedings.mlr.press/v51/vandemeent16.html
AB  - In this work we show how to represent policies as programs: that is, as stochastic simulators with tunable parameters.  To learn the parameters of such policies we develop connections between black box variational inference and existing policy search approaches.  We then explain how such learning can be implemented in a probabilistic programming system.  Using our own novel implementation of such a system we demonstrate both conciseness of policy representation and automatic policy parameter learning for a set of canonical reinforcement learning problems.
ER  -

APA

Vandemeent, J., Paige, B., Tolpin, D. & Wood, F. (2016). Black-Box Policy Search with Probabilistic Programs. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1195-1204. Available from https://proceedings.mlr.press/v51/vandemeent16.html.
