Learning Backchanneling Behaviors for a Social Robot via Data Augmentation from Human-Human Conversations

Michael Murray, Nick Walker, Amal Nanavati, Patricia Alves-Oliveira, Nikita Filippov, Allison Sauppe, Bilge Mutlu, Maya Cakmak
Proceedings of the 5th Conference on Robot Learning, PMLR 164:513-525, 2022.

Abstract

Backchanneling behaviors on a robot, such as nodding, can make talking to a robot feel more natural and engaging by giving a sense that the robot is actively listening. For backchanneling to be effective, it is important that the timing of such cues is appropriate given the humans’ conversational behaviors. Recent progress has shown that these behaviors can be learned from datasets of human-human conversations. However, recent data-driven methods tend to overfit to the human speakers that are seen in training data and fail to generalize well to previously unseen speakers. In this paper, we explore the use of data augmentation for effective nodding behavior in a robot. We show that, by augmenting the input speech and visual features, we can produce data-driven models that are more robust to unseen features without collecting additional data. We analyze the efficacy of data-driven backchanneling in a realistic human-robot conversational setting with a user study, showing that users perceived the data-driven model to be better at listening as compared to rule-based and random baselines.

Cite this Paper


BibTeX
@InProceedings{pmlr-v164-murray22a,
  title = {Learning Backchanneling Behaviors for a Social Robot via Data Augmentation from Human-Human Conversations},
  author = {Murray, Michael and Walker, Nick and Nanavati, Amal and Alves-Oliveira, Patricia and Filippov, Nikita and Sauppe, Allison and Mutlu, Bilge and Cakmak, Maya},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages = {513--525},
  year = {2022},
  editor = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume = {164},
  series = {Proceedings of Machine Learning Research},
  month = {08--11 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v164/murray22a/murray22a.pdf},
  url = {https://proceedings.mlr.press/v164/murray22a.html},
  abstract = {Backchanneling behaviors on a robot, such as nodding, can make talking to a robot feel more natural and engaging by giving a sense that the robot is actively listening. For backchanneling to be effective, it is important that the timing of such cues is appropriate given the humans’ conversational behaviors. Recent progress has shown that these behaviors can be learned from datasets of human-human conversations. However, recent data-driven methods tend to overfit to the human speakers that are seen in training data and fail to generalize well to previously unseen speakers. In this paper, we explore the use of data augmentation for effective nodding behavior in a robot. We show that, by augmenting the input speech and visual features, we can produce data-driven models that are more robust to unseen features without collecting additional data. We analyze the efficacy of data-driven backchanneling in a realistic human-robot conversational setting with a user study, showing that users perceived the data-driven model to be better at listening as compared to rule-based and random baselines.}
}
Endnote
%0 Conference Paper
%T Learning Backchanneling Behaviors for a Social Robot via Data Augmentation from Human-Human Conversations
%A Michael Murray
%A Nick Walker
%A Amal Nanavati
%A Patricia Alves-Oliveira
%A Nikita Filippov
%A Allison Sauppe
%A Bilge Mutlu
%A Maya Cakmak
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-murray22a
%I PMLR
%P 513--525
%U https://proceedings.mlr.press/v164/murray22a.html
%V 164
%X Backchanneling behaviors on a robot, such as nodding, can make talking to a robot feel more natural and engaging by giving a sense that the robot is actively listening. For backchanneling to be effective, it is important that the timing of such cues is appropriate given the humans’ conversational behaviors. Recent progress has shown that these behaviors can be learned from datasets of human-human conversations. However, recent data-driven methods tend to overfit to the human speakers that are seen in training data and fail to generalize well to previously unseen speakers. In this paper, we explore the use of data augmentation for effective nodding behavior in a robot. We show that, by augmenting the input speech and visual features, we can produce data-driven models that are more robust to unseen features without collecting additional data. We analyze the efficacy of data-driven backchanneling in a realistic human-robot conversational setting with a user study, showing that users perceived the data-driven model to be better at listening as compared to rule-based and random baselines.
APA
Murray, M., Walker, N., Nanavati, A., Alves-Oliveira, P., Filippov, N., Sauppe, A., Mutlu, B., & Cakmak, M. (2022). Learning Backchanneling Behaviors for a Social Robot via Data Augmentation from Human-Human Conversations. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:513-525. Available from https://proceedings.mlr.press/v164/murray22a.html.
