Group Fairness in Predict-Then-Optimize Settings for Restless Bandits

Shresth Verma, Yunfan Zhao, Sanket Shah, Niclas Boehmer, Aparna Taneja, Milind Tambe
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:3448-3469, 2024.

Abstract

Restless multi-arm bandits (RMABs) are a model for sequentially allocating a limited number of resources to agents modeled as Markov Decision Processes. RMABs have applications in cellular networks, anti-poaching, and in particular, healthcare. For such high-stakes use cases, allocations are often required to treat different groups of agents (e.g., defined by sensitive attributes) fairly. In addition to the fairness challenge, agents’ transition probabilities are often unknown and need to be learned in real-world problems. Thus, group fairness in RMABs requires us to simultaneously learn transition probabilities and how much budget we allocate to each group. Overcoming this key challenge ignored by previous work, we develop a decision-focused-learning pipeline to solve equitable RMABs, using a novel budget allocation algorithm to prevent disparity between groups. Our results on both synthetic and real-world large-scale datasets demonstrate that incorporating fair planning into the learning step greatly improves equity with little sacrifice in utility.
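
To make the setting concrete: each round, a planner may pull at most B of N arms (agents), and group fairness constrains how those pulls are divided across groups. The sketch below shows one such round in plain Python; the function name fair_group_action, the proportional per-group split, and the use of generic per-arm priority scores (e.g., Whittle indices) are all illustrative assumptions for this page, not the paper's novel budget-allocation algorithm.

import numpy as np

def fair_group_action(indices, groups, budget):
    """Toy group-aware action selection for one RMAB round.

    NOTE: illustrative stand-in only; the paper's actual allocation
    algorithm is not specified in the abstract. Here the total budget
    is split across groups in proportion to group size, then the
    highest-priority arms are pulled within each group.
    """
    groups = np.asarray(groups)
    indices = np.asarray(indices, dtype=float)
    labels, counts = np.unique(groups, return_counts=True)
    # Proportional split, rounded down; leftover pulls go to the
    # groups with the largest fractional shares.
    shares = counts / counts.sum() * budget
    per_group = np.floor(shares).astype(int)
    leftover = budget - per_group.sum()
    for g in np.argsort(shares - per_group)[::-1][:leftover]:
        per_group[g] += 1
    # Within each group, act on the top-k arms by priority score.
    action = np.zeros(len(indices), dtype=int)
    for label, k in zip(labels, per_group):
        members = np.where(groups == label)[0]
        top = members[np.argsort(indices[members])[::-1][:k]]
        action[top] = 1
    return action

# Example: 6 arms in two groups, budget of 3 pulls per round.
idx = [0.9, 0.1, 0.5, 0.8, 0.3, 0.7]   # per-arm priority scores
grp = ["A", "A", "A", "B", "B", "B"]   # sensitive-attribute groups
print(fair_group_action(idx, grp, budget=3))  # [1 0 1 1 0 0]

In the paper itself, the per-group budgets are not fixed by a static rule like the one above; they are chosen to prevent disparity between groups and are learned jointly with the agents' transition probabilities in a decision-focused-learning pipeline.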

Cite this Paper

BibTeX

@InProceedings{pmlr-v244-verma24a,
  title     = {Group Fairness in Predict-Then-Optimize Settings for Restless Bandits},
  author    = {Verma, Shresth and Zhao, Yunfan and Shah, Sanket and Boehmer, Niclas and Taneja, Aparna and Tambe, Milind},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages     = {3448--3469},
  year      = {2024},
  editor    = {Kiyavash, Negar and Mooij, Joris M.},
  volume    = {244},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/verma24a/verma24a.pdf},
  url       = {https://proceedings.mlr.press/v244/verma24a.html},
  abstract  = {Restless multi-arm bandits (RMABs) are a model for sequentially allocating a limited number of resources to agents modeled as Markov Decision Processes. RMABs have applications in cellular networks, anti-poaching, and in particular, healthcare. For such high-stakes use cases, allocations are often required to treat different groups of agents (e.g., defined by sensitive attributes) fairly. In addition to the fairness challenge, agents’ transition probabilities are often unknown and need to be learned in real-world problems. Thus, group fairness in RMABs requires us to simultaneously learn transition probabilities and how much budget we allocate to each group. Overcoming this key challenge ignored by previous work, we develop a decision-focused-learning pipeline to solve equitable RMABs, using a novel budget allocation algorithm to prevent disparity between groups. Our results on both synthetic and real-world large-scale datasets demonstrate that incorporating fair planning into the learning step greatly improves equity with little sacrifice in utility.}
}
Endnote
%0 Conference Paper
%T Group Fairness in Predict-Then-Optimize Settings for Restless Bandits
%A Shresth Verma
%A Yunfan Zhao
%A Sanket Shah
%A Niclas Boehmer
%A Aparna Taneja
%A Milind Tambe
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-verma24a
%I PMLR
%P 3448--3469
%U https://proceedings.mlr.press/v244/verma24a.html
%V 244
%X Restless multi-arm bandits (RMABs) are a model for sequentially allocating a limited number of resources to agents modeled as Markov Decision Processes. RMABs have applications in cellular networks, anti-poaching, and in particular, healthcare. For such high-stakes use cases, allocations are often required to treat different groups of agents (e.g., defined by sensitive attributes) fairly. In addition to the fairness challenge, agents’ transition probabilities are often unknown and need to be learned in real-world problems. Thus, group fairness in RMABs requires us to simultaneously learn transition probabilities and how much budget we allocate to each group. Overcoming this key challenge ignored by previous work, we develop a decision-focused-learning pipeline to solve equitable RMABs, using a novel budget allocation algorithm to prevent disparity between groups. Our results on both synthetic and real-world large-scale datasets demonstrate that incorporating fair planning into the learning step greatly improves equity with little sacrifice in utility.
APA
Verma, S., Zhao, Y., Shah, S., Boehmer, N., Taneja, A., & Tambe, M. (2024). Group Fairness in Predict-Then-Optimize Settings for Restless Bandits. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:3448-3469. Available from https://proceedings.mlr.press/v244/verma24a.html.
