Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3311-3320, 2019.

Abstract

Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint k. We first propose SIEVE-STREAMING++, which requires just one pass over the data, keeps only O(k) elements and achieves the tight 12-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal 14-approximation with Θ(k) memory or the optimal 12-approximation with O(klogk) memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of SIEVE-STREAMING++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight 12-approximation guarantee with O(k) shared memory, while minimizing not only the rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-kazemi19a, title = {Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity}, author = {Kazemi, Ehsan and Mitrovic, Marko and Zadimoghaddam, Morteza and Lattanzi, Silvio and Karbasi, Amin}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {3311--3320}, year = {2019}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, volume = {97}, series = {Proceedings of Machine Learning Research}, month = {09--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v97/kazemi19a/kazemi19a.pdf}, url = {https://proceedings.mlr.press/v97/kazemi19a.html}, abstract = {Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose SIEVE-STREAMING++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $\frac{1}{2}$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $\frac{1}{4}$-approximation with $\Theta(k)$ memory or the optimal $\frac{1}{2}$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of SIEVE-STREAMING++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $\frac{1}{2}$-approximation guarantee with $O(k)$ shared memory, while minimizing not only the rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.} }
Endnote
%0 Conference Paper %T Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity %A Ehsan Kazemi %A Marko Mitrovic %A Morteza Zadimoghaddam %A Silvio Lattanzi %A Amin Karbasi %B Proceedings of the 36th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Ruslan Salakhutdinov %F pmlr-v97-kazemi19a %I PMLR %P 3311--3320 %U https://proceedings.mlr.press/v97/kazemi19a.html %V 97 %X Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose SIEVE-STREAMING++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $\frac{1}{2}$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $\frac{1}{4}$-approximation with $\Theta(k)$ memory or the optimal $\frac{1}{2}$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of SIEVE-STREAMING++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $\frac{1}{2}$-approximation guarantee with $O(k)$ shared memory, while minimizing not only the rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.
APA
Kazemi, E., Mitrovic, M., Zadimoghaddam, M., Lattanzi, S. & Karbasi, A.. (2019). Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:3311-3320 Available from https://proceedings.mlr.press/v97/kazemi19a.html.

Related Material