Learning Representations that Enable Generalization in Assistive Tasks

Jerry Zhi-Yang He, Zackory Erickson, Daniel S. Brown, Aditi Raghunathan, Anca Dragan
Proceedings of The 6th Conference on Robot Learning, PMLR 205:2105-2114, 2023.

Abstract

Recent work in sim2real has successfully enabled robots to act in physical environments by training in simulation with a diverse “population” of environments (i.e. domain randomization). In this work, we focus on enabling generalization in assistive tasks: tasks in which the robot is acting to assist a user (e.g. helping someone with motor impairments with bathing or with scratching an itch). Such tasks are particularly interesting relative to prior sim2real successes because the environment now contains a human who is also acting. This complicates the problem because the diversity of human users (instead of merely physical environment parameters) is more difficult to capture in a population, thus increasing the likelihood of encountering out-of-distribution (OOD) human policies at test time. We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies that test-time humans can accurately be mapped to, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies based on the simulated population only. We study how to best learn such a representation by evaluating on purposefully constructed OOD test policies. We find that sim2real methods that encode environment (or population) parameters and work well in tasks that robots do in isolation do not work well in assistance. In assistance, it seems crucial to train the representation based on the history of interaction directly, because that is what the robot will have access to at test time. Further, training these representations to then predict human actions not only gives them better structure, but also enables them to be fine-tuned at test time, when the robot observes the partner act.
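
To make the abstract's recipe concrete, below is a minimal sketch (not the authors' implementation) of the two ingredients it argues for: an encoder that maps the history of interaction to a latent representation of the human, trained through a human-action prediction head, which can then be fine-tuned at test time as the robot observes the partner act. The GRU encoder, layer sizes, MSE loss, and random placeholder tensors are all illustrative assumptions.

    # Sketch of a history-conditioned human representation with a
    # human-action prediction head; all architectural choices are assumed.
    import torch
    import torch.nn as nn

    STATE_DIM, HUMAN_ACT_DIM, LATENT_DIM = 16, 4, 8

    class HistoryEncoder(nn.Module):
        """Maps a history of (state, human action) pairs to a latent z."""
        def __init__(self):
            super().__init__()
            self.rnn = nn.GRU(STATE_DIM + HUMAN_ACT_DIM, 64, batch_first=True)
            self.to_z = nn.Linear(64, LATENT_DIM)

        def forward(self, history):          # history: (B, T, state+action)
            _, h = self.rnn(history)         # h: (1, B, 64)
            return self.to_z(h[-1])          # z: (B, LATENT_DIM)

    class HumanActionPredictor(nn.Module):
        """Predicts the human's next action from the current state and z."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STATE_DIM + LATENT_DIM, 64), nn.ReLU(),
                nn.Linear(64, HUMAN_ACT_DIM))

        def forward(self, state, z):
            return self.net(torch.cat([state, z], dim=-1))

    encoder, predictor = HistoryEncoder(), HumanActionPredictor()
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3)

    # Training on the simulated population: the representation is learned
    # from interaction history alone, supervised by the human's next action.
    # Random tensors stand in for rollouts from simulated human policies.
    for _ in range(100):
        history = torch.randn(32, 10, STATE_DIM + HUMAN_ACT_DIM)
        state = torch.randn(32, STATE_DIM)
        next_human_action = torch.randn(32, HUMAN_ACT_DIM)
        loss = nn.functional.mse_loss(
            predictor(state, encoder(history)), next_human_action)
        opt.zero_grad(); loss.backward(); opt.step()

    # Test time with an OOD partner: the same prediction loss, now computed
    # on actions the robot actually observes, fine-tunes the representation.
    ft_opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
    for _ in range(10):
        history = torch.randn(1, 10, STATE_DIM + HUMAN_ACT_DIM)  # observed so far
        state = torch.randn(1, STATE_DIM)
        observed_action = torch.randn(1, HUMAN_ACT_DIM)
        ft_loss = nn.functional.mse_loss(
            predictor(state, encoder(history)), observed_action)
        ft_opt.zero_grad(); ft_loss.backward(); ft_opt.step()

Conditioning the encoder on interaction history, rather than on ground-truth population parameters that are unavailable outside simulation, matches what the robot actually observes at deployment; the prediction head then provides a supervision signal that remains available online, enabling fine-tuning on an OOD partner.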

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-he23a,
  title     = {Learning Representations that Enable Generalization in Assistive Tasks},
  author    = {He, Jerry Zhi-Yang and Erickson, Zackory and Brown, Daniel S. and Raghunathan, Aditi and Dragan, Anca},
  booktitle = {Proceedings of The 6th Conference on Robot Learning},
  pages     = {2105--2114},
  year      = {2023},
  editor    = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume    = {205},
  series    = {Proceedings of Machine Learning Research},
  month     = {14--18 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v205/he23a/he23a.pdf},
  url       = {https://proceedings.mlr.press/v205/he23a.html},
  abstract  = {Recent work in sim2real has successfully enabled robots to act in physical environments by training in simulation with a diverse “population” of environments (i.e. domain randomization). In this work, we focus on enabling generalization in \emph{assistive tasks}: tasks in which the robot is acting to assist a user (e.g. helping someone with motor impairments with bathing or with scratching an itch). Such tasks are particularly interesting relative to prior sim2real successes because the environment now contains a \emph{human who is also acting}. This complicates the problem because the diversity of human users (instead of merely physical environment parameters) is more difficult to capture in a population, thus increasing the likelihood of encountering out-of-distribution (OOD) human policies at test time. We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies that test-time humans can accurately be mapped to, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies based on the simulated population only. We study how to best learn such a representation by evaluating on purposefully constructed OOD test policies. We find that sim2real methods that encode environment (or population) parameters and work well in tasks that robots do in isolation, do not work well in \emph{assistance}. In assistance, it seems crucial to train the representation based on the \emph{history of interaction} directly, because that is what the robot will have access to at test time. Further, training these representations to then \emph{predict human actions} not only gives them better structure, but also enables them to be fine-tuned at test-time, when the robot observes the partner act.}
}
Endnote
%0 Conference Paper
%T Learning Representations that Enable Generalization in Assistive Tasks
%A Jerry Zhi-Yang He
%A Zackory Erickson
%A Daniel S. Brown
%A Aditi Raghunathan
%A Anca Dragan
%B Proceedings of The 6th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Karen Liu
%E Dana Kulic
%E Jeff Ichnowski
%F pmlr-v205-he23a
%I PMLR
%P 2105--2114
%U https://proceedings.mlr.press/v205/he23a.html
%V 205
%X Recent work in sim2real has successfully enabled robots to act in physical environments by training in simulation with a diverse “population” of environments (i.e. domain randomization). In this work, we focus on enabling generalization in \emph{assistive tasks}: tasks in which the robot is acting to assist a user (e.g. helping someone with motor impairments with bathing or with scratching an itch). Such tasks are particularly interesting relative to prior sim2real successes because the environment now contains a \emph{human who is also acting}. This complicates the problem because the diversity of human users (instead of merely physical environment parameters) is more difficult to capture in a population, thus increasing the likelihood of encountering out-of-distribution (OOD) human policies at test time. We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies that test-time humans can accurately be mapped to, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies based on the simulated population only. We study how to best learn such a representation by evaluating on purposefully constructed OOD test policies. We find that sim2real methods that encode environment (or population) parameters and work well in tasks that robots do in isolation, do not work well in \emph{assistance}. In assistance, it seems crucial to train the representation based on the \emph{history of interaction} directly, because that is what the robot will have access to at test time. Further, training these representations to then \emph{predict human actions} not only gives them better structure, but also enables them to be fine-tuned at test-time, when the robot observes the partner act.
APA
He, J.Z., Erickson, Z., Brown, D.S., Raghunathan, A. & Dragan, A. (2023). Learning Representations that Enable Generalization in Assistive Tasks. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:2105-2114. Available from https://proceedings.mlr.press/v205/he23a.html.

Related Material

Download PDF: https://proceedings.mlr.press/v205/he23a/he23a.pdf