OptNet: Differentiable Optimization as a Layer in Neural Networks

Brandon Amos; J. Zico Kolter

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:136-145, 2017.

Abstract

This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks. These layers encode constraints and complex dependencies between the hidden states that traditional convolutional and fully-connected layers often cannot capture. In this paper, we explore the foundations for such an architecture: we show how techniques from sensitivity analysis, bilevel optimization, and implicit differentiation can be used to exactly differentiate through these layers and with respect to layer parameters; we develop a highly efficient solver for these layers that exploits fast GPU-based batch solves within a primal-dual interior point method, and which provides backpropagation gradients with virtually no additional cost on top of the solve; and we highlight the application of these approaches in several problems. In one notable example, we show that the method is capable of learning to play mini-Sudoku (4x4) given just input and output games, with no a priori information about the rules of the game; this highlights the ability of our architecture to learn hard constraints better than other neural architectures.
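As an illustration of the implicit-differentiation idea mentioned in the abstract, the following is a minimal sketch and not the authors' solver. It assumes a simplified, equality-constrained quadratic program (minimize 0.5 z'Qz + q'z subject to Az = b), with no inequality constraints and no interior point method. The helper names `solve_eq_qp` and `grad_wrt_q` are hypothetical. Under these assumptions the solution satisfies a linear KKT system, and the gradient of a downstream loss with respect to q can be obtained from one more solve with the same KKT matrix; the sketch compares that gradient with a finite-difference estimate.

```python
import numpy as np


def solve_eq_qp(Q, q, A, b):
    """Solve min_z 0.5 z'Qz + q'z subject to Az = b via its KKT system.

    Hypothetical helper: returns the primal solution z and the KKT matrix K.
    """
    n, m = Q.shape[0], A.shape[0]
    # KKT conditions: Q z + A' nu = -q  and  A z = b
    K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-q, b]))
    return sol[:n], K


def grad_wrt_q(K, grad_z):
    """Map dL/dz to dL/dq by differentiating the KKT conditions implicitly.

    Because K is symmetric, dL/dq is minus the primal part of K^{-1} [dL/dz; 0].
    """
    n = grad_z.shape[0]
    rhs = np.concatenate([grad_z, np.zeros(K.shape[0] - n)])
    return -np.linalg.solve(K, rhs)[:n]


# Small random problem (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, m = 4, 2
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)  # symmetric positive definite
q = rng.standard_normal(n)
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

z, K = solve_eq_qp(Q, q, A, b)

# Downstream loss L(z) = 0.5 * ||z||^2, so dL/dz = z.
analytic = grad_wrt_q(K, z)

# Central finite differences on q as an independent check.
eps = 1e-6
numeric = np.zeros(n)
for i in range(n):
    dq = np.zeros(n)
    dq[i] = eps
    z_plus, _ = solve_eq_qp(Q, q + dq, A, b)
    z_minus, _ = solve_eq_qp(Q, q - dq, A, b)
    numeric[i] = (0.5 * z_plus @ z_plus - 0.5 * z_minus @ z_minus) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))
```

In this sketch the backward pass reuses the matrix already formed for the forward solve, which is a simplified analogue of the abstract's remark that gradients come at little additional cost on top of the solve.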

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-amos17a,
  title = 	 {{O}pt{N}et: Differentiable Optimization as a Layer in Neural Networks},
  author =       {Brandon Amos and J. Zico Kolter},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {136--145},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/amos17a/amos17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/amos17a.html},
  abstract = 	 {This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks. These layers encode constraints and complex dependencies between the hidden states that traditional convolutional and fully-connected layers often cannot capture. In this paper, we explore the foundations for such an architecture: we show how techniques from sensitivity analysis, bilevel optimization, and implicit differentiation can be used to exactly differentiate through these layers and with respect to layer parameters; we develop a highly efficient solver for these layers that exploits fast GPU-based batch solves within a primal-dual interior point method, and which provides backpropagation gradients with virtually no additional cost on top of the solve; and we highlight the application of these approaches in several problems. In one notable example, we show that the method is capable of learning to play mini-Sudoku (4x4) given just input and output games, with no a priori information about the rules of the game; this highlights the ability of our architecture to learn hard constraints better than other neural architectures.}
}

Endnote

%0 Conference Paper
%T OptNet: Differentiable Optimization as a Layer in Neural Networks
%A Brandon Amos
%A J. Zico Kolter
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-amos17a
%I PMLR
%P 136--145
%U https://proceedings.mlr.press/v70/amos17a.html
%V 70
%X This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks. These layers encode constraints and complex dependencies between the hidden states that traditional convolutional and fully-connected layers often cannot capture. In this paper, we explore the foundations for such an architecture: we show how techniques from sensitivity analysis, bilevel optimization, and implicit differentiation can be used to exactly differentiate through these layers and with respect to layer parameters; we develop a highly efficient solver for these layers that exploits fast GPU-based batch solves within a primal-dual interior point method, and which provides backpropagation gradients with virtually no additional cost on top of the solve; and we highlight the application of these approaches in several problems. In one notable example, we show that the method is capable of learning to play mini-Sudoku (4x4) given just input and output games, with no a priori information about the rules of the game; this highlights the ability of our architecture to learn hard constraints better than other neural architectures.

APA


Amos, B. & Kolter, J. Z. (2017). OptNet: Differentiable Optimization as a Layer in Neural Networks. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:136-145. Available from https://proceedings.mlr.press/v70/amos17a.html.
