The Differentiable Cross-Entropy Method

Brandon Amos; Denis Yarats

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:291-302, 2020.

Abstract

We study the Cross-Entropy Method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant that enables us to differentiate the output of CEM with respect to the objective function’s parameters. In the machine learning setting this brings CEM inside of the end-to-end learning pipeline where this has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting we show how to embed optimal action sequences into a lower-dimensional space. This enables us to use policy optimization to fine-tune modeling components by differentiating through the CEM-based controller.
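
For orientation, the standard (non-differentiable) CEM that the abstract builds on can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation and not their differentiable variant: the function name `cem_minimize`, the diagonal Gaussian sampling distribution, the hyperparameter defaults, and the toy objective `quadratic` are all hypothetical choices made for this example. The hard top-k elite selection in the loop is not differentiable with respect to the objective's parameters `theta`, which is the kind of obstacle a differentiable variant has to remove.

```python
import numpy as np


def cem_minimize(f, theta, dim, n_iter=20, n_samples=200, n_elite=20, seed=0):
    """Vanilla CEM with a diagonal Gaussian sampler (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)    # assumed initial mean of the sampling distribution
    sigma = np.ones(dim)  # assumed initial per-dimension standard deviation
    for _ in range(n_iter):
        # Draw candidate solutions from the current Gaussian.
        x = mu + sigma * rng.standard_normal((n_samples, dim))
        # Evaluate the parameterized objective f(x; theta) on every candidate.
        values = np.array([f(xi, theta) for xi in x])
        # Hard top-k: keep the n_elite candidates with the lowest objective value.
        elite = x[np.argsort(values)[:n_elite]]
        # Refit the Gaussian to the elite set; the small constant avoids a zero std.
        mu = elite.mean(axis=0)
        sigma = elite.std(axis=0) + 1e-6
    return mu


def quadratic(x, theta):
    """Hypothetical toy objective whose minimizer is x = theta."""
    return float(np.sum((x - theta) ** 2))


theta = np.array([0.5, -0.5])
x_star = cem_minimize(quadratic, theta, dim=2)
```

In this toy case the minimizer equals `theta`, so the returned `x_star` depends on the objective's parameters only through the sampled and sorted candidates; there is no gradient path from `x_star` back to `theta`.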

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-amos20a,
  title = 	 {The Differentiable Cross-Entropy Method},
  author =       {Amos, Brandon and Yarats, Denis},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {291--302},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/amos20a/amos20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/amos20a.html},
  abstract = 	 {We study the Cross-Entropy Method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant that enables us to differentiate the output of CEM with respect to the objective function’s parameters. In the machine learning setting this brings CEM inside of the end-to-end learning pipeline where this has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting we show how to embed optimal action sequences into a lower-dimensional space. This enables us to use policy optimization to fine-tune modeling components by differentiating through the CEM-based controller.}
}

Endnote

%0 Conference Paper
%T The Differentiable Cross-Entropy Method
%A Brandon Amos
%A Denis Yarats
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-amos20a
%I PMLR
%P 291--302
%U https://proceedings.mlr.press/v119/amos20a.html
%V 119
%X We study the Cross-Entropy Method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant that enables us to differentiate the output of CEM with respect to the objective function’s parameters. In the machine learning setting this brings CEM inside of the end-to-end learning pipeline where this has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting we show how to embed optimal action sequences into a lower-dimensional space. This enables us to use policy optimization to fine-tune modeling components by differentiating through the CEM-based controller.

APA

Amos, B. & Yarats, D. (2020). The Differentiable Cross-Entropy Method. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:291-302. Available from https://proceedings.mlr.press/v119/amos20a.html.
