Learning Policies for Contextual Submodular Prediction

Stephane Ross; Jiaji Zhou; Yisong Yue; Debadeepta Dey; Drew Bagnell

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1364-1372, 2013.

Abstract

Many prediction domains, such as ad placement, recommendation, trajectory prediction, and document summarization, require predicting a set or list of options. Such lists are often evaluated using submodular reward functions that measure both quality and diversity. We propose a simple, efficient, and provably near-optimal approach to optimizing such prediction problems based on no-regret learning. Our method leverages a surprising result from online submodular optimization: a single no-regret online learner can compete with an optimal sequence of predictions. Compared to previous work, which either learns a sequence of classifiers or relies on stronger assumptions such as realizability, we ensure both data efficiency and performance guarantees in the fully agnostic setting. Experiments validate the efficiency and applicability of the approach on a wide range of problems, including manipulator trajectory optimization, news recommendation, and document summarization.
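
To make the notion of a submodular reward that measures both quality and diversity more concrete, the following is a minimal Python sketch. It is not taken from the paper and is not the authors' algorithm: the item names, topic sets, and the helper functions `coverage`, `marginal_gain`, and `greedy_list` are all hypothetical, and the reward is assumed to be a simple topic-coverage count. The sketch builds a list greedily by marginal gain, which is the classical baseline for monotone submodular objectives, and shows the diminishing-returns property that defines submodularity.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy data: each candidate item covers a set of topics.
ITEMS: Dict[str, FrozenSet[str]] = {
    "a": frozenset({"sports", "politics"}),
    "b": frozenset({"sports"}),
    "c": frozenset({"tech", "science"}),
    "d": frozenset({"politics", "tech"}),
}


def coverage(selected: List[str]) -> int:
    """Assumed reward: number of distinct topics covered by the list."""
    covered: Set[str] = set()
    for name in selected:
        covered |= ITEMS[name]
    return len(covered)


def marginal_gain(selected: List[str], candidate: str) -> int:
    """Increase in reward from appending `candidate` to `selected`."""
    return coverage(selected + [candidate]) - coverage(selected)


def greedy_list(budget: int) -> List[str]:
    """Build a list of up to `budget` items, picking the largest gain each step."""
    chosen: List[str] = []
    for _ in range(budget):
        # Sorted order makes tie-breaking deterministic (first maximum wins).
        remaining = [name for name in sorted(ITEMS) if name not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda name: marginal_gain(chosen, name))
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    picked = greedy_list(2)
    print(picked, coverage(picked))
    # Diminishing returns: "d" adds less once "a" is already in the list.
    print(marginal_gain([], "d"), marginal_gain(["a"], "d"))
```

In this toy setting, an item that repeats already-covered topics contributes little, so the greedy list favors diverse items. The approach summarized in the abstract differs in that it learns a policy from context via no-regret online learning rather than evaluating the reward directly at prediction time.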

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-ross13b,
  title     = {Learning Policies for Contextual Submodular Prediction},
  author    = {Ross, Stephane and Zhou, Jiaji and Yue, Yisong and Dey, Debadeepta and Bagnell, Drew},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages     = {1364--1372},
  year      = {2013},
  editor    = {Dasgupta, Sanjoy and McAllester, David},
  volume    = {28},
  number    = {3},
  series    = {Proceedings of Machine Learning Research},
  address   = {Atlanta, Georgia, USA},
  month     = {17--19 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v28/ross13b.pdf},
  url       = {https://proceedings.mlr.press/v28/ross13b.html},
  abstract = 	 {Many prediction domains, such as ad placement, recommendation, trajectory prediction, and document summarization, require predicting a set or list of options. Such lists are often evaluated using submodular reward functions that measure both quality and diversity. We propose a simple, efficient, and provably near-optimal approach to optimizing such prediction problems based on no-regret learning. Our method leverages a surprising result from online submodular optimization: a single no-regret online learner can compete with an optimal sequence of predictions. Compared to previous work, which either learn a sequence of classifiers or rely on stronger assumptions such as realizability, we ensure both data-efficiency as well as performance guarantees in the fully agnostic setting. Experiments validate the efficiency and applicability of the approach on a wide range of problems including manipulator trajectory optimization, news recommendation and document summarization.}
}

Endnote

%0 Conference Paper
%T Learning Policies for Contextual Submodular Prediction
%A Stephane Ross
%A Jiaji Zhou
%A Yisong Yue
%A Debadeepta Dey
%A Drew Bagnell
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-ross13b
%I PMLR
%P 1364--1372
%U https://proceedings.mlr.press/v28/ross13b.html
%V 28
%N 3
%X Many prediction domains, such as ad placement, recommendation, trajectory prediction, and document summarization, require predicting a set or list of options. Such lists are often evaluated using submodular reward functions that measure both quality and diversity. We propose a simple, efficient, and provably near-optimal approach to optimizing such prediction problems based on no-regret learning. Our method leverages a surprising result from online submodular optimization: a single no-regret online learner can compete with an optimal sequence of predictions. Compared to previous work, which either learn a sequence of classifiers or rely on stronger assumptions such as realizability, we ensure both data-efficiency as well as performance guarantees in the fully agnostic setting. Experiments validate the efficiency and applicability of the approach on a wide range of problems including manipulator trajectory optimization, news recommendation and document summarization.

RIS


TY  - CPAPER
TI  - Learning Policies for Contextual Submodular Prediction
AU  - Stephane Ross
AU  - Jiaji Zhou
AU  - Yisong Yue
AU  - Debadeepta Dey
AU  - Drew Bagnell
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-ross13b
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 1364
EP  - 1372
L1  - http://proceedings.mlr.press/v28/ross13b.pdf
UR  - https://proceedings.mlr.press/v28/ross13b.html
AB  - Many prediction domains, such as ad placement, recommendation, trajectory prediction, and document summarization, require predicting a set or list of options. Such lists are often evaluated using submodular reward functions that measure both quality and diversity. We propose a simple, efficient, and provably near-optimal approach to optimizing such prediction problems based on no-regret learning. Our method leverages a surprising result from online submodular optimization: a single no-regret online learner can compete with an optimal sequence of predictions. Compared to previous work, which either learn a sequence of classifiers or rely on stronger assumptions such as realizability, we ensure both data-efficiency as well as performance guarantees in the fully agnostic setting. Experiments validate the efficiency and applicability of the approach on a wide range of problems including manipulator trajectory optimization, news recommendation and document summarization.
ER  -

APA


Ross, S., Zhou, J., Yue, Y., Dey, D., & Bagnell, D. (2013). Learning Policies for Contextual Submodular Prediction. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1364-1372. Available from https://proceedings.mlr.press/v28/ross13b.html.
