Targeted Data Acquisition for Evolving Negotiation Agents

Minae Kwon, Siddharth Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:5894-5904, 2021.

Abstract

Successful negotiators must learn how to balance optimizing for self-interest and cooperation. Yet current artificial negotiation agents often heavily depend on the quality of the static datasets they were trained on, limiting their capacity to fashion an adaptive response balancing self-interest and cooperation. For this reason, we find that these agents can achieve either high utility or cooperation, but not both. To address this, we introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent using annotations from an expert oracle. The guided exploration incentivizes the learning agent to go beyond its static dataset and develop new negotiation strategies. We show that this enables our agents to obtain higher-reward and more Pareto-optimal solutions when negotiating with both simulated and human partners compared to standard supervised learning and reinforcement learning methods. This trend additionally holds when comparing agents using our targeted data acquisition framework to variants of agents trained with a mix of supervised learning and reinforcement learning, or to agents using tailored reward functions that explicitly optimize for utility and Pareto-optimality.

Cite this Paper
BibTeX
@InProceedings{pmlr-v139-kwon21a,
  title     = {Targeted Data Acquisition for Evolving Negotiation Agents},
  author    = {Kwon, Minae and Karamcheti, Siddharth and Cuellar, Mariano-Florentino and Sadigh, Dorsa},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {5894--5904},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/kwon21a/kwon21a.pdf},
  url       = {https://proceedings.mlr.press/v139/kwon21a.html},
  abstract  = {Successful negotiators must learn how to balance optimizing for self-interest and cooperation. Yet current artificial negotiation agents often heavily depend on the quality of the static datasets they were trained on, limiting their capacity to fashion an adaptive response balancing self-interest and cooperation. For this reason, we find that these agents can achieve either high utility or cooperation, but not both. To address this, we introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent using annotations from an expert oracle. The guided exploration incentivizes the learning agent to go beyond its static dataset and develop new negotiation strategies. We show that this enables our agents to obtain higher-reward and more Pareto-optimal solutions when negotiating with both simulated and human partners compared to standard supervised learning and reinforcement learning methods. This trend additionally holds when comparing agents using our targeted data acquisition framework to variants of agents trained with a mix of supervised learning and reinforcement learning, or to agents using tailored reward functions that explicitly optimize for utility and Pareto-optimality.}
}
Endnote
%0 Conference Paper
%T Targeted Data Acquisition for Evolving Negotiation Agents
%A Minae Kwon
%A Siddharth Karamcheti
%A Mariano-Florentino Cuellar
%A Dorsa Sadigh
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-kwon21a
%I PMLR
%P 5894--5904
%U https://proceedings.mlr.press/v139/kwon21a.html
%V 139
%X Successful negotiators must learn how to balance optimizing for self-interest and cooperation. Yet current artificial negotiation agents often heavily depend on the quality of the static datasets they were trained on, limiting their capacity to fashion an adaptive response balancing self-interest and cooperation. For this reason, we find that these agents can achieve either high utility or cooperation, but not both. To address this, we introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent using annotations from an expert oracle. The guided exploration incentivizes the learning agent to go beyond its static dataset and develop new negotiation strategies. We show that this enables our agents to obtain higher-reward and more Pareto-optimal solutions when negotiating with both simulated and human partners compared to standard supervised learning and reinforcement learning methods. This trend additionally holds when comparing agents using our targeted data acquisition framework to variants of agents trained with a mix of supervised learning and reinforcement learning, or to agents using tailored reward functions that explicitly optimize for utility and Pareto-optimality.
APA
Kwon, M., Karamcheti, S., Cuellar, M., & Sadigh, D. (2021). Targeted Data Acquisition for Evolving Negotiation Agents. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:5894-5904. Available from https://proceedings.mlr.press/v139/kwon21a.html.