Fairness in Reinforcement Learning

Shahin Jabbari; Matthew Joseph; Michael Kearns; Jamie Morgenstern; Aaron Roth

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1617-1626, 2017.

Abstract

We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness.
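
The fairness constraint described in the abstract can be written out as a formula. This is only a sketch: the symbols $\pi$ (the algorithm's action distribution), $Q^*$ (optimal discounted action value), and $\gamma$ (discount factor) are notation assumed here for illustration and do not appear on this page. The paper's formal definition may differ in details such as how ties and failure probabilities are handled.

```latex
% Assumed notation (requires amssymb for \mathbb):
%   r_t        reward received at step t
%   \gamma     discount factor in [0,1)
%   \pi^*      an optimal policy
%   \pi(a|s)   probability that the algorithm chooses action a in state s
\[
  Q^*(s,a) \;=\; \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_t \;\middle|\; s_0 = s,\ a_0 = a,\ \text{then follow } \pi^* \right]
\]
% Fairness as worded in the abstract: never prefer a over a'
% when a' has the higher long-term discounted reward.
\[
  Q^*(s,a') > Q^*(s,a) \;\Longrightarrow\; \pi(a \mid s) \le \pi(a' \mid s)
  \qquad \text{for all states } s \text{ and actions } a, a'.
\]
```

Under this reading, an optimal policy that always picks an action maximizing $Q^*$ satisfies the constraint, which matches the abstract's remark that fairness is consistent with the optimal policy.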

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-jabbari17a,
  title = {Fairness in Reinforcement Learning},
  author = {Shahin Jabbari and Matthew Joseph and Michael Kearns and Jamie Morgenstern and Aaron Roth},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {1617--1626},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v70/jabbari17a/jabbari17a.pdf},
  url = {https://proceedings.mlr.press/v70/jabbari17a.html},
  abstract = {We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness.}
}

Endnote

%0 Conference Paper
%T Fairness in Reinforcement Learning
%A Shahin Jabbari
%A Matthew Joseph
%A Michael Kearns
%A Jamie Morgenstern
%A Aaron Roth
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-jabbari17a
%I PMLR
%P 1617--1626
%U https://proceedings.mlr.press/v70/jabbari17a.html
%V 70
%X We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness.

APA


Jabbari, S., Joseph, M., Kearns, M., Morgenstern, J. & Roth, A. (2017). Fairness in Reinforcement Learning. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1617-1626. Available from https://proceedings.mlr.press/v70/jabbari17a.html.
