Provably Efficient Third-Person Imitation from Offline Observation

Aaron Zweig, Joan Bruna
Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), PMLR 124:1228-1237, 2020.

Abstract

Domain adaptation in imitation learning represents an essential step towards improving generalizability. However, even in the restricted setting of third-person imitation where transfer is between isomorphic Markov Decision Processes, there are no strong guarantees on the performance of transferred policies. We present problem-dependent, statistical learning guarantees for third-person imitation from observation in an offline setting, and a lower bound on performance in the online setting.

Cite this Paper


BibTeX
@InProceedings{pmlr-v124-zweig20a,
  title = {Provably Efficient Third-Person Imitation from Offline Observation},
  author = {Zweig, Aaron and Bruna, Joan},
  booktitle = {Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)},
  pages = {1228--1237},
  year = {2020},
  editor = {Jonas Peters and David Sontag},
  volume = {124},
  series = {Proceedings of Machine Learning Research},
  month = {03--06 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v124/zweig20a/zweig20a.pdf},
  url = {http://proceedings.mlr.press/v124/zweig20a.html},
  abstract = {Domain adaptation in imitation learning represents an essential step towards improving generalizability. However, even in the restricted setting of third-person imitation where transfer is between isomorphic Markov Decision Processes, there are no strong guarantees on the performance of transferred policies. We present problem-dependent, statistical learning guarantees for third-person imitation from observation in an offline setting, and a lower bound on performance in the online setting.}
}
Endnote
%0 Conference Paper
%T Provably Efficient Third-Person Imitation from Offline Observation
%A Aaron Zweig
%A Joan Bruna
%B Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)
%C Proceedings of Machine Learning Research
%D 2020
%E Jonas Peters
%E David Sontag
%F pmlr-v124-zweig20a
%I PMLR
%P 1228--1237
%U http://proceedings.mlr.press/v124/zweig20a.html
%V 124
%X Domain adaptation in imitation learning represents an essential step towards improving generalizability. However, even in the restricted setting of third-person imitation where transfer is between isomorphic Markov Decision Processes, there are no strong guarantees on the performance of transferred policies. We present problem-dependent, statistical learning guarantees for third-person imitation from observation in an offline setting, and a lower bound on performance in the online setting.
APA
Zweig, A. & Bruna, J. (2020). Provably Efficient Third-Person Imitation from Offline Observation. Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), in Proceedings of Machine Learning Research 124:1228-1237. Available from http://proceedings.mlr.press/v124/zweig20a.html
