Ethics in Action: Training Reinforcement Learning Agents for Moral Decision-making In Text-based Adventure Games

Weichen Li, Rati Devidze, Waleed Mustafa, Sophie Fellenz
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1954-1962, 2024.

Abstract

Reinforcement Learning (RL) has demonstrated its potential in solving goal-oriented sequential tasks. However, with the increasing capabilities of RL agents, ensuring morally responsible agent behavior is becoming a pressing concern. Previous approaches have included moral considerations by statically assigning a moral score to each action at runtime. However, these methods do not account for the potential moral value of future states when evaluating immoral actions. This limits the ability to find trade-offs between different aspects of moral behavior and the utility of the action. In this paper, we aim to factor in moral scores by adding a constraint to the RL objective that is incorporated during training, thereby dynamically adapting the policy function. By combining Lagrangian optimization and meta-gradient learning, we develop an RL method that is able to find a trade-off between immoral behavior and performance in the decision-making process.
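The constrained objective sketched in the abstract can be written in the standard Lagrangian-relaxation form used in constrained RL. The notation below is generic and illustrative, not necessarily the paper's own formulation: \(r_t\) is the task reward, \(c_t\) a per-step moral cost, and \(d\) a violation budget.

```latex
% Constrained RL: maximize task return while keeping the
% expected (discounted) moral cost below a threshold d.
\max_{\theta} \; J_r(\theta)
  \quad \text{s.t.} \quad J_c(\theta) \le d,
\qquad
J_r(\theta) = \mathbb{E}_{\pi_\theta}\!\Big[\textstyle\sum_{t} \gamma^t r_t\Big],
\quad
J_c(\theta) = \mathbb{E}_{\pi_\theta}\!\Big[\textstyle\sum_{t} \gamma^t c_t\Big].

% Lagrangian relaxation: the multiplier \lambda mediates the
% trade-off between task performance and moral violations.
\mathcal{L}(\theta, \lambda) = J_r(\theta) - \lambda\,\big(J_c(\theta) - d\big),
\qquad
\max_{\theta} \; \min_{\lambda \ge 0} \; \mathcal{L}(\theta, \lambda).
```

Because the cost term enters the training objective rather than being applied as a static per-action filter at runtime, the policy can account for the moral value of future states when weighing an action, which is the distinction the abstract draws from prior work.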

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-li24i,
  title     = {Ethics in Action: Training Reinforcement Learning Agents for Moral Decision-making In Text-based Adventure Games},
  author    = {Li, Weichen and Devidze, Rati and Mustafa, Waleed and Fellenz, Sophie},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {1954--1962},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/li24i/li24i.pdf},
  url       = {https://proceedings.mlr.press/v238/li24i.html},
  abstract  = {Reinforcement Learning (RL) has demonstrated its potential in solving goal-oriented sequential tasks. However, with the increasing capabilities of RL agents, ensuring morally responsible agent behavior is becoming a pressing concern. Previous approaches have included moral considerations by statically assigning a moral score to each action at runtime. However, these methods do not account for the potential moral value of future states when evaluating immoral actions. This limits the ability to find trade-offs between different aspects of moral behavior and the utility of the action. In this paper, we aim to factor in moral scores by adding a constraint to the RL objective that is incorporated during training, thereby dynamically adapting the policy function. By combining Lagrangian optimization and meta-gradient learning, we develop an RL method that is able to find a trade-off between immoral behavior and performance in the decision-making process.}
}
Endnote
%0 Conference Paper
%T Ethics in Action: Training Reinforcement Learning Agents for Moral Decision-making In Text-based Adventure Games
%A Weichen Li
%A Rati Devidze
%A Waleed Mustafa
%A Sophie Fellenz
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-li24i
%I PMLR
%P 1954--1962
%U https://proceedings.mlr.press/v238/li24i.html
%V 238
%X Reinforcement Learning (RL) has demonstrated its potential in solving goal-oriented sequential tasks. However, with the increasing capabilities of RL agents, ensuring morally responsible agent behavior is becoming a pressing concern. Previous approaches have included moral considerations by statically assigning a moral score to each action at runtime. However, these methods do not account for the potential moral value of future states when evaluating immoral actions. This limits the ability to find trade-offs between different aspects of moral behavior and the utility of the action. In this paper, we aim to factor in moral scores by adding a constraint to the RL objective that is incorporated during training, thereby dynamically adapting the policy function. By combining Lagrangian optimization and meta-gradient learning, we develop an RL method that is able to find a trade-off between immoral behavior and performance in the decision-making process.
APA
Li, W., Devidze, R., Mustafa, W. & Fellenz, S. (2024). Ethics in Action: Training Reinforcement Learning Agents for Moral Decision-making In Text-based Adventure Games. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1954-1962. Available from https://proceedings.mlr.press/v238/li24i.html.