Including Uncertainty when Learning from Human Corrections

Dylan P. Losey, Marcia K. O’Malley
Proceedings of The 2nd Conference on Robot Learning, PMLR 87:123-132, 2018.

Abstract

It is difficult for humans to efficiently teach robots how to correctly perform a task. One intuitive solution is for the robot to iteratively learn the human’s preferences from corrections, where the human improves the robot’s current behavior at each iteration. When learning from corrections, we argue that while the robot should estimate the most likely human preferences, it should also know what it does not know, and integrate this uncertainty as it makes decisions. We advance the state-of-the-art by introducing a Kalman filter for learning from corrections: this approach obtains the uncertainty of the estimated human preferences. Next, we demonstrate how the estimate uncertainty can be leveraged for active learning and risk-sensitive deployment. Our results indicate that obtaining and leveraging uncertainty leads to faster learning from human corrections.
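The core idea of the paper, maintaining both a point estimate of the human's preference weights and an explicit uncertainty over that estimate, can be illustrated with a standard Kalman update. The sketch below is not the paper's implementation; the observation model `H`, noise covariance `R`, and all dimensions are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only (not the paper's method): treat the human's
# preference weights theta as the state, and each human correction as a
# noisy linear observation of those weights. The covariance P is the
# "what it does not know" that the abstract argues the robot should track.

def kalman_correction_update(theta_hat, P, z, H, R):
    """One Kalman update of the preference estimate from a correction.

    theta_hat: current mean estimate of the preference weights, shape (n,)
    P:         covariance of the estimate (uncertainty), shape (n, n)
    z:         observed correction signal, shape (m,)
    H:         assumed observation matrix mapping preferences to
               corrections, shape (m, n)
    R:         assumed correction (measurement) noise covariance, (m, m)
    """
    S = H @ P @ H.T + R                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    theta_hat = theta_hat + K @ (z - H @ theta_hat)
    P = (np.eye(len(theta_hat)) - K @ H) @ P
    return theta_hat, P

# Toy usage: a scalar preference observed directly with noise. Each
# correction pulls the estimate toward the human's true preference (1.0)
# while the covariance P shrinks, quantifying growing confidence.
theta_hat, P = np.zeros(1), np.eye(1)
for z in [0.9, 1.1, 1.0]:                    # noisy human corrections
    theta_hat, P = kalman_correction_update(
        theta_hat, P, np.array([z]), np.eye(1), 0.1 * np.eye(1))
print(theta_hat, P)
```

A covariance like `P` is exactly what enables the active-learning and risk-sensitive behaviors the abstract mentions: the robot can ask for corrections where `P` is large, or act conservatively until it shrinks.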

Cite this Paper


BibTeX
@InProceedings{pmlr-v87-losey18a,
  title     = {Including Uncertainty when Learning from Human Corrections},
  author    = {Losey, Dylan P. and O'Malley, Marcia K.},
  booktitle = {Proceedings of The 2nd Conference on Robot Learning},
  pages     = {123--132},
  year      = {2018},
  editor    = {Billard, Aude and Dragan, Anca and Peters, Jan and Morimoto, Jun},
  volume    = {87},
  series    = {Proceedings of Machine Learning Research},
  month     = {29--31 Oct},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v87/losey18a/losey18a.pdf},
  url       = {https://proceedings.mlr.press/v87/losey18a.html},
  abstract  = {It is difficult for humans to efficiently teach robots how to correctly perform a task. One intuitive solution is for the robot to iteratively learn the human’s preferences from corrections, where the human improves the robot’s current behavior at each iteration. When learning from corrections, we argue that while the robot should estimate the most likely human preferences, it should also know what it does not know, and integrate this uncertainty as it makes decisions. We advance the state-of-the-art by introducing a Kalman filter for learning from corrections: this approach obtains the uncertainty of the estimated human preferences. Next, we demonstrate how the estimate uncertainty can be leveraged for active learning and risk-sensitive deployment. Our results indicate that obtaining and leveraging uncertainty leads to faster learning from human corrections.}
}
Endnote
%0 Conference Paper
%T Including Uncertainty when Learning from Human Corrections
%A Dylan P. Losey
%A Marcia K. O’Malley
%B Proceedings of The 2nd Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Aude Billard
%E Anca Dragan
%E Jan Peters
%E Jun Morimoto
%F pmlr-v87-losey18a
%I PMLR
%P 123--132
%U https://proceedings.mlr.press/v87/losey18a.html
%V 87
%X It is difficult for humans to efficiently teach robots how to correctly perform a task. One intuitive solution is for the robot to iteratively learn the human’s preferences from corrections, where the human improves the robot’s current behavior at each iteration. When learning from corrections, we argue that while the robot should estimate the most likely human preferences, it should also know what it does not know, and integrate this uncertainty as it makes decisions. We advance the state-of-the-art by introducing a Kalman filter for learning from corrections: this approach obtains the uncertainty of the estimated human preferences. Next, we demonstrate how the estimate uncertainty can be leveraged for active learning and risk-sensitive deployment. Our results indicate that obtaining and leveraging uncertainty leads to faster learning from human corrections.
APA
Losey, D.P. & O’Malley, M.K. (2018). Including Uncertainty when Learning from Human Corrections. Proceedings of The 2nd Conference on Robot Learning, in Proceedings of Machine Learning Research 87:123-132. Available from https://proceedings.mlr.press/v87/losey18a.html.