Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning

Dung Nguyen; Svetha Venkatesh; Phuoc Nguyen; Truyen Tran

Proceedings of The 12th Asian Conference on Machine Learning, PMLR 129:33-48, 2020.

Abstract

Guilt aversion induces the experience of a utility loss in people if they believe they have disappointed others, and this promotes cooperative behaviour in humans. In psychological game theory, guilt aversion necessitates modelling agents that hold a theory about what other agents think, also known as Theory of Mind (ToM). We aim to build a new kind of affective reinforcement learning agent, called Theory of Mind Agents with Guilt Aversion (ToMAGA), which is equipped with the ability to think about the wellbeing of others instead of just self-interest. To validate the agent design, we use a general-sum game known as Stag Hunt as a test bed. As standard reinforcement learning agents could learn suboptimal policies in social dilemmas like Stag Hunt, we propose to use belief-based guilt aversion as a reward shaping mechanism. We show that our belief-based guilt-averse agents can efficiently learn cooperative behaviours in Stag Hunt games.
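
As a rough illustration of the reward-shaping idea in the abstract, the sketch below subtracts a guilt term from an agent's material payoff in a one-shot Stag Hunt. The payoff values, the parameter `theta`, the function names, and the specific guilt formula (the other agent's expected payoff minus its actual payoff, floored at zero) are assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch only: payoffs and the guilt formula are assumptions,
# not the formulation used in the paper.
STAG, HARE = 0, 1

# PAYOFF[(a_self, a_other)] -> (payoff_self, payoff_other); hypothetical values
# chosen to satisfy the usual Stag Hunt ordering.
PAYOFF = {
    (STAG, STAG): (4.0, 4.0),
    (STAG, HARE): (0.0, 3.0),
    (HARE, STAG): (3.0, 0.0),
    (HARE, HARE): (1.0, 1.0),
}


def expected_other_payoff(p_other_stag, belief_self_stag):
    """Payoff the other agent expects to receive.

    p_other_stag: probability that the other agent hunts stag.
    belief_self_stag: our estimate of how likely the other agent
        thinks we are to hunt stag (a second-order belief).
    """
    total = 0.0
    for a_self, p_self in ((STAG, belief_self_stag), (HARE, 1.0 - belief_self_stag)):
        for a_other, p_other in ((STAG, p_other_stag), (HARE, 1.0 - p_other_stag)):
            total += p_self * p_other * PAYOFF[(a_self, a_other)][1]
    return total


def guilt_shaped_reward(a_self, a_other, p_other_stag, belief_self_stag, theta=1.0):
    """Material payoff minus theta times the other agent's disappointment."""
    material, other_actual = PAYOFF[(a_self, a_other)]
    expected = expected_other_payoff(p_other_stag, belief_self_stag)
    disappointment = max(0.0, expected - other_actual)
    return material - theta * disappointment


if __name__ == "__main__":
    # The other agent hunts stag and fully expects us to do the same.
    print(guilt_shaped_reward(HARE, STAG, 1.0, 1.0))  # defecting on a trusting partner
    print(guilt_shaped_reward(STAG, STAG, 1.0, 1.0))  # cooperating
```

With these hypothetical numbers, defecting against a partner who expects cooperation yields a lower shaped reward than cooperating, whereas with `theta=0.0` the agent sees only its material payoff.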

Cite this Paper

BibTeX

@InProceedings{pmlr-v129-nguyen20a,
  title     = {Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning},
  author    = {Nguyen, Dung and Venkatesh, Svetha and Nguyen, Phuoc and Tran, Truyen},
  booktitle = {Proceedings of The 12th Asian Conference on Machine Learning},
  pages     = {33--48},
  year      = {2020},
  editor    = {Pan, Sinno Jialin and Sugiyama, Masashi},
  volume    = {129},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--20 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v129/nguyen20a/nguyen20a.pdf},
  url       = {https://proceedings.mlr.press/v129/nguyen20a.html},
  abstract  = {Guilt aversion induces the experience of a utility loss in people if they believe they have disappointed others, and this promotes cooperative behaviour in humans. In psychological game theory, guilt aversion necessitates modelling agents that hold a theory about what other agents think, also known as Theory of Mind (ToM). We aim to build a new kind of affective reinforcement learning agent, called Theory of Mind Agents with Guilt Aversion (ToMAGA), which is equipped with the ability to think about the wellbeing of others instead of just self-interest. To validate the agent design, we use a general-sum game known as Stag Hunt as a test bed. As standard reinforcement learning agents could learn suboptimal policies in social dilemmas like Stag Hunt, we propose to use belief-based guilt aversion as a reward shaping mechanism. We show that our belief-based guilt-averse agents can efficiently learn cooperative behaviours in Stag Hunt games.}
}

Endnote

%0 Conference Paper
%T Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning
%A Dung Nguyen
%A Svetha Venkatesh
%A Phuoc Nguyen
%A Truyen Tran
%B Proceedings of The 12th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Sinno Jialin Pan
%E Masashi Sugiyama
%F pmlr-v129-nguyen20a
%I PMLR
%P 33--48
%U https://proceedings.mlr.press/v129/nguyen20a.html
%V 129
%X Guilt aversion induces the experience of a utility loss in people if they believe they have disappointed others, and this promotes cooperative behaviour in humans. In psychological game theory, guilt aversion necessitates modelling agents that hold a theory about what other agents think, also known as Theory of Mind (ToM). We aim to build a new kind of affective reinforcement learning agent, called Theory of Mind Agents with Guilt Aversion (ToMAGA), which is equipped with the ability to think about the wellbeing of others instead of just self-interest. To validate the agent design, we use a general-sum game known as Stag Hunt as a test bed. As standard reinforcement learning agents could learn suboptimal policies in social dilemmas like Stag Hunt, we propose to use belief-based guilt aversion as a reward shaping mechanism. We show that our belief-based guilt-averse agents can efficiently learn cooperative behaviours in Stag Hunt games.

APA

Nguyen, D., Venkatesh, S., Nguyen, P., & Tran, T. (2020). Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning. Proceedings of The 12th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 129:33-48. Available from https://proceedings.mlr.press/v129/nguyen20a.html.
