Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization

Yuxin Chen, Andreas Krause
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(1):160-168, 2013.

Abstract

Active learning can lead to a dramatic reduction in labeling effort. However, in many practical implementations (such as crowdsourcing, surveys, high-throughput experimental design), it is preferable to query labels for batches of examples to be labeled in parallel. While several heuristics have been proposed for batch-mode active learning, little is known about their theoretical performance. We consider batch-mode active learning and more general information-parallel stochastic optimization problems that exhibit adaptive submodularity, a natural diminishing returns condition. We prove that for such problems, a simple greedy strategy is competitive with the optimal batch-mode policy. In some cases, surprisingly, the use of batches incurs competitively low cost, even when compared to a fully sequential strategy. We demonstrate the effectiveness of our approach on batch-mode active learning tasks, where it outperforms the state of the art, as well as the novel problem of multi-stage influence maximization in social networks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-chen13b,
  title     = {Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization},
  author    = {Chen, Yuxin and Krause, Andreas},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages     = {160--168},
  year      = {2013},
  editor    = {Dasgupta, Sanjoy and McAllester, David},
  volume    = {28},
  number    = {1},
  series    = {Proceedings of Machine Learning Research},
  address   = {Atlanta, Georgia, USA},
  month     = {17--19 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v28/chen13b.pdf},
  url       = {https://proceedings.mlr.press/v28/chen13b.html}
}
Endnote
%0 Conference Paper
%T Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization
%A Yuxin Chen
%A Andreas Krause
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-chen13b
%I PMLR
%P 160--168
%U https://proceedings.mlr.press/v28/chen13b.html
%V 28
%N 1
RIS
TY - CPAPER
TI - Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization
AU - Yuxin Chen
AU - Andreas Krause
BT - Proceedings of the 30th International Conference on Machine Learning
DA - 2013/02/13
ED - Sanjoy Dasgupta
ED - David McAllester
ID - pmlr-v28-chen13b
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 28
IS - 1
SP - 160
EP - 168
L1 - http://proceedings.mlr.press/v28/chen13b.pdf
UR - https://proceedings.mlr.press/v28/chen13b.html
ER -
APA
Chen, Y. & Krause, A. (2013). Near-optimal Batch Mode Active Learning and Adaptive Submodular Optimization. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(1):160-168. Available from https://proceedings.mlr.press/v28/chen13b.html.