Fairness in Reinforcement Learning

Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1617-1626, 2017.

Abstract

We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness.
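For concreteness, the constraint can be sketched in standard MDP notation (this is a reading of the abstract, not necessarily the paper's exact formalization: \pi_t denotes the learner's policy at round t and Q^* the optimal discounted action-value function). Exact fairness requires that for every round t, state s, and pair of actions a, a',

    \pi_t(a \mid s) > \pi_t(a' \mid s) \;\Longrightarrow\; Q^*(s, a) \ge Q^*(s, a'),

while the approximate notion underlying the polynomial-time result weakens the consequent to Q^*(s, a) > Q^*(s, a') - \epsilon for a slack parameter \epsilon > 0.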

Cite this Paper

BibTeX
@InProceedings{pmlr-v70-jabbari17a,
  title     = {Fairness in Reinforcement Learning},
  author    = {Shahin Jabbari and Matthew Joseph and Michael Kearns and Jamie Morgenstern and Aaron Roth},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {1617--1626},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/jabbari17a/jabbari17a.pdf},
  url       = {https://proceedings.mlr.press/v70/jabbari17a.html}
}
Endnote
%0 Conference Paper
%T Fairness in Reinforcement Learning
%A Shahin Jabbari
%A Matthew Joseph
%A Michael Kearns
%A Jamie Morgenstern
%A Aaron Roth
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-jabbari17a
%I PMLR
%P 1617--1626
%U https://proceedings.mlr.press/v70/jabbari17a.html
%V 70
APA
Jabbari, S., Joseph, M., Kearns, M., Morgenstern, J. & Roth, A. (2017). Fairness in Reinforcement Learning. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1617-1626. Available from https://proceedings.mlr.press/v70/jabbari17a.html.
