Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Yasin Abbasi-Yadkori; Peter Bartlett; Xi Chen; Alan Malek

Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Yasin Abbasi-Yadkori, Peter Bartlett, Xi Chen, Alan Malek

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1053-1062, 2015.

Abstract

We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by controlling the important crowdsourcing application of budget allocation in crowd labeling.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-abbasi-yadkori15,
  title = 	 {Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing},
  author = 	 {Abbasi-Yadkori, Yasin and Bartlett, Peter and Chen, Xi and Malek, Alan},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {1053--1062},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/abbasi-yadkori15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/abbasi-yadkori15.html},
  abstract = 	 {We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by controlling the important crowdsourcing application of budget allocation in crowd labeling.}
}

Endnote

%0 Conference Paper
%T Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing
%A Yasin Abbasi-Yadkori
%A Peter Bartlett
%A Xi Chen
%A Alan Malek
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei	
%F pmlr-v37-abbasi-yadkori15
%I PMLR
%P 1053--1062
%U https://proceedings.mlr.press/v37/abbasi-yadkori15.html
%V 37
%X We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by controlling the important crowdsourcing application of budget allocation in crowd labeling.

RIS


TY  - CPAPER
TI  - Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing
AU  - Yasin Abbasi-Yadkori
AU  - Peter Bartlett
AU  - Xi Chen
AU  - Alan Malek
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei	
ID  - pmlr-v37-abbasi-yadkori15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 1053
EP  - 1062
L1  - http://proceedings.mlr.press/v37/abbasi-yadkori15.pdf
UR  - https://proceedings.mlr.press/v37/abbasi-yadkori15.html
AB  - We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by controlling the important crowdsourcing application of budget allocation in crowd labeling.
ER  -

APA


Abbasi-Yadkori, Y., Bartlett, P., Chen, X. & Malek, A.. (2015). Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:1053-1062 Available from https://proceedings.mlr.press/v37/abbasi-yadkori15.html.

Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing

Abstract

Cite this Paper

Related Material