Optimal Budget Allocation for Crowdsourcing Labels for Graphs

Adithya Kulkarni; Mohna Chakraborty; Sihong Xie; Qi Li

Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1154-1163, 2023.

Abstract

Crowdsourcing is an effective and efficient paradigm for obtaining labels for an unlabeled corpus by employing crowd workers. This work considers the budget allocation problem in a generalized setting: a graph of instances to be labeled, where edges encode instance dependencies. Specifically, given a graph and a labeling budget, we propose an optimal policy to allocate the budget among the instances to maximize the overall labeling accuracy. We formulate the problem as a Bayesian Markov Decision Process (MDP), where we define our task as an optimization problem that maximizes the overall label accuracy under budget constraints. Then, we propose a novel stage-wise reward function that considers the effect of worker labels on the whole graph at each timestamp. This reward function is used to find an optimal policy for the optimization problem. Theoretically, we show that our proposed policies are consistent when the budget is infinite. We conduct extensive experiments on five real-world graph datasets and demonstrate that the proposed policies achieve higher label accuracy under budget constraints.

Cite this Paper

BibTeX

@InProceedings{pmlr-v216-kulkarni23a,
  title = 	 {Optimal Budget Allocation for Crowdsourcing Labels for Graphs},
  author =       {Kulkarni, Adithya and Chakraborty, Mohna and Xie, Sihong and Li, Qi},
  booktitle = 	 {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {1154--1163},
  year = 	 {2023},
  editor = 	 {Evans, Robin J. and Shpitser, Ilya},
  volume = 	 {216},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {31 Jul--04 Aug},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v216/kulkarni23a/kulkarni23a.pdf},
  url = 	 {https://proceedings.mlr.press/v216/kulkarni23a.html},
  abstract = 	 {Crowdsourcing is an effective and efficient paradigm for obtaining labels for unlabeled corpus employing crowd workers. This work considers the budget allocation problem for a generalized setting on a graph of instances to be labeled where edges encode instance dependencies. Specifically, given a graph and a labeling budget, we propose an optimal policy to allocate the budget among the instances to maximize the overall labeling accuracy. We formulate the problem as a Bayesian Markov Decision Process (MDP), where we define our task as an optimization problem that maximizes the overall label accuracy under budget constraints. Then, we propose a novel stage-wise reward function that considers the effect of worker labels on the whole graph at each timestamp. This reward function is utilized to find an optimal policy for the optimization problem. Theoretically, we show that our proposed policies are consistent when the budget is infinite. We conduct extensive experiments on five real-world graph datasets and demonstrate the effectiveness of the proposed policies to achieve a higher label accuracy under budget constraints.}
}

Endnote

%0 Conference Paper
%T Optimal Budget Allocation for Crowdsourcing Labels for Graphs
%A Adithya Kulkarni
%A Mohna Chakraborty
%A Sihong Xie
%A Qi Li
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-kulkarni23a
%I PMLR
%P 1154--1163
%U https://proceedings.mlr.press/v216/kulkarni23a.html
%V 216
%X Crowdsourcing is an effective and efficient paradigm for obtaining labels for unlabeled corpus employing crowd workers. This work considers the budget allocation problem for a generalized setting on a graph of instances to be labeled where edges encode instance dependencies. Specifically, given a graph and a labeling budget, we propose an optimal policy to allocate the budget among the instances to maximize the overall labeling accuracy. We formulate the problem as a Bayesian Markov Decision Process (MDP), where we define our task as an optimization problem that maximizes the overall label accuracy under budget constraints. Then, we propose a novel stage-wise reward function that considers the effect of worker labels on the whole graph at each timestamp. This reward function is utilized to find an optimal policy for the optimization problem. Theoretically, we show that our proposed policies are consistent when the budget is infinite. We conduct extensive experiments on five real-world graph datasets and demonstrate the effectiveness of the proposed policies to achieve a higher label accuracy under budget constraints.

APA

Kulkarni, A., Chakraborty, M., Xie, S. & Li, Q. (2023). Optimal Budget Allocation for Crowdsourcing Labels for Graphs. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:1154-1163. Available from https://proceedings.mlr.press/v216/kulkarni23a.html.
