Learning A Risk-Aware Trajectory Planner From Demonstrations Using Logic Monitor

Xiao Li, Jonathan DeCastro, Cristian Ioan Vasile, Sertac Karaman, Daniela Rus
Proceedings of the 5th Conference on Robot Learning, PMLR 164:1326-1335, 2022.

Abstract

Risk awareness is an important factor to consider when deploying policies on robots in the real world. Defining the right set of risk metrics can be difficult. In this work, we use a differentiable logic monitor that keeps track of the environmental agents’ behaviors and provides a risk metric that the controlled agent can incorporate during planning. We introduce LogicRiskNet, a learning structure that can be constructed from temporal logic formulas describing rules governing a safe agent’s behaviors. The network’s parameters can be learned from demonstration data. By using temporal logic, the network provides an interpretable architecture that can explain what risk metrics are important to the human. We integrate LogicRiskNet in an inverse optimal control (IOC) framework and show that we can learn to generate trajectory plans that accurately mimic the expert’s risk-handling behaviors solely from demonstration data. We evaluate our method on a real-world driving dataset.

Cite this Paper


BibTeX
@InProceedings{pmlr-v164-li22c,
  title = {Learning A Risk-Aware Trajectory Planner From Demonstrations Using Logic Monitor},
  author = {Li, Xiao and DeCastro, Jonathan and Vasile, Cristian Ioan and Karaman, Sertac and Rus, Daniela},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages = {1326--1335},
  year = {2022},
  editor = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume = {164},
  series = {Proceedings of Machine Learning Research},
  month = {08--11 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v164/li22c/li22c.pdf},
  url = {https://proceedings.mlr.press/v164/li22c.html},
  abstract = {Risk awareness is an important factor to consider when deploying policies on robots in the real-world. Defining the right set of risk metrics can be difficult. In this work, we use a differentiable logic monitor that keeps track of the environmental agents’ behaviors and provides a risk metric that the controlled agent can incorporate during planning. We introduce LogicRiskNet, a learning structure that can be constructed from temporal logic formulas describing rules governing a safe agent’s behaviors. The network’s parameters can be learned from demonstration data. By using temporal logic, the network provides an interpretable architecture that can explain what risk metrics are important to the human. We integrate LogicRiskNet in an inverse optimal control (IOC) framework and show that we can learn to generate trajectory plans that accurately mimic the expert’s risk handling behaviors solely from demonstration data. We evaluate our method on a real-world driving dataset.}
}
Endnote
%0 Conference Paper
%T Learning A Risk-Aware Trajectory Planner From Demonstrations Using Logic Monitor
%A Xiao Li
%A Jonathan DeCastro
%A Cristian Ioan Vasile
%A Sertac Karaman
%A Daniela Rus
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-li22c
%I PMLR
%P 1326--1335
%U https://proceedings.mlr.press/v164/li22c.html
%V 164
%X Risk awareness is an important factor to consider when deploying policies on robots in the real-world. Defining the right set of risk metrics can be difficult. In this work, we use a differentiable logic monitor that keeps track of the environmental agents’ behaviors and provides a risk metric that the controlled agent can incorporate during planning. We introduce LogicRiskNet, a learning structure that can be constructed from temporal logic formulas describing rules governing a safe agent’s behaviors. The network’s parameters can be learned from demonstration data. By using temporal logic, the network provides an interpretable architecture that can explain what risk metrics are important to the human. We integrate LogicRiskNet in an inverse optimal control (IOC) framework and show that we can learn to generate trajectory plans that accurately mimic the expert’s risk handling behaviors solely from demonstration data. We evaluate our method on a real-world driving dataset.
APA
Li, X., DeCastro, J., Vasile, C.I., Karaman, S. & Rus, D. (2022). Learning A Risk-Aware Trajectory Planner From Demonstrations Using Logic Monitor. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:1326-1335. Available from https://proceedings.mlr.press/v164/li22c.html.
