Learning solutions to hybrid control problems using Benders cuts

Sandeep Menta; Joseph Warrington; John Lygeros; Manfred Morari

Learning solutions to hybrid control problems using Benders cuts

Sandeep Menta, Joseph Warrington, John Lygeros, Manfred Morari

Proceedings of the 2nd Conference on Learning for Dynamics and Control, PMLR 120:118-126, 2020.

Abstract

Hybrid control problems are complicated by the need to make a suitable sequence of discrete decisions related to future modes of operation of the system. Model predictive control (MPC) encodes a finite-horizon truncation of such problems as a mixed-integer program, and then imposes a cost and/or constraints on the terminal state intended to reflect all post-horizon behaviour. However, these are often ad hoc choices tuned by hand after empirically observing performance. We present a learning method that sidesteps this problem, in which the so-called N-step Q-function of the problem is approximated from below, using Benders’ decomposition. The function takes a state and a sequence of N control decisions as arguments, and therefore extends the traditional notion of a Q-function from reinforcement learning. After learning it from a training process exploring the state-input space, we use it in place of the usual MPC objective. We take an example hybrid control task and show that it can be completed successfully with a shorter planning horizon than conventional hybrid MPC thanks to our proposed method. Furthermore, we report that Q-functions trained with long horizons can be truncated to a shorter horizon for online use, yielding simpler control laws with apparently little loss of performance.

Cite this Paper

BibTeX


@InProceedings{pmlr-v120-menta20a,
  title = 	 {Learning solutions to hybrid control problems using Benders cuts},
  author =       {Menta, Sandeep and Warrington, Joseph and Lygeros, John and Morari, Manfred},
  booktitle = 	 {Proceedings of the 2nd Conference on Learning for Dynamics and Control},
  pages = 	 {118--126},
  year = 	 {2020},
  editor = 	 {Bayen, Alexandre M. and Jadbabaie, Ali and Pappas, George and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire and Zeilinger, Melanie},
  volume = 	 {120},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--11 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v120/menta20a/menta20a.pdf},
  url = 	 {https://proceedings.mlr.press/v120/menta20a.html},
  abstract = 	 {Hybrid control problems are complicated by the need to make a suitable sequence of discrete decisions related to future modes of operation of the system. Model predictive control (MPC) encodes a finite-horizon truncation of such problems as a mixed-integer program, and then imposes a cost and/or constraints on the terminal state intended to reflect all post-horizon behaviour. However, these are often ad hoc choices tuned by hand after empirically observing performance. We present a learning method that sidesteps this problem, in which the so-called N-step Q-function of the problem is approximated from below, using Benders’ decomposition. The function takes a state and a sequence of N control decisions as arguments, and therefore extends the traditional notion of a Q-function from reinforcement learning. After learning it from a training process exploring the state-input space, we use it in place of the usual MPC objective. We take an example hybrid control task and show that it can be completed successfully with a shorter planning horizon than conventional hybrid MPC thanks to our proposed method. Furthermore, we report that Q-functions trained with long horizons can be truncated to a shorter horizon for online use, yielding simpler control laws with apparently little loss of performance.}
}

Endnote

%0 Conference Paper
%T Learning solutions to hybrid control problems using Benders cuts
%A Sandeep Menta
%A Joseph Warrington
%A John Lygeros
%A Manfred Morari
%B Proceedings of the 2nd Conference on Learning for Dynamics and Control
%C Proceedings of Machine Learning Research
%D 2020
%E Alexandre M. Bayen
%E Ali Jadbabaie
%E George Pappas
%E Pablo A. Parrilo
%E Benjamin Recht
%E Claire Tomlin
%E Melanie Zeilinger	
%F pmlr-v120-menta20a
%I PMLR
%P 118--126
%U https://proceedings.mlr.press/v120/menta20a.html
%V 120
%X Hybrid control problems are complicated by the need to make a suitable sequence of discrete decisions related to future modes of operation of the system. Model predictive control (MPC) encodes a finite-horizon truncation of such problems as a mixed-integer program, and then imposes a cost and/or constraints on the terminal state intended to reflect all post-horizon behaviour. However, these are often ad hoc choices tuned by hand after empirically observing performance. We present a learning method that sidesteps this problem, in which the so-called N-step Q-function of the problem is approximated from below, using Benders’ decomposition. The function takes a state and a sequence of N control decisions as arguments, and therefore extends the traditional notion of a Q-function from reinforcement learning. After learning it from a training process exploring the state-input space, we use it in place of the usual MPC objective. We take an example hybrid control task and show that it can be completed successfully with a shorter planning horizon than conventional hybrid MPC thanks to our proposed method. Furthermore, we report that Q-functions trained with long horizons can be truncated to a shorter horizon for online use, yielding simpler control laws with apparently little loss of performance.

APA


Menta, S., Warrington, J., Lygeros, J. & Morari, M.. (2020). Learning solutions to hybrid control problems using Benders cuts. Proceedings of the 2nd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 120:118-126 Available from https://proceedings.mlr.press/v120/menta20a.html.

Related Material

Download PDF