Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3311-3320, 2019.

Abstract

Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose SIEVE-STREAMING++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $\frac{1}{2}$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $\frac{1}{4}$-approximation with $\Theta(k)$ memory or the optimal $\frac{1}{2}$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of SIEVE-STREAMING++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $\frac{1}{2}$-approximation guarantee with $O(k)$ shared memory, while minimizing not only the rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-kazemi19a, title = {Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity}, author = {Kazemi, Ehsan and Mitrovic, Marko and Zadimoghaddam, Morteza and Lattanzi, Silvio and Karbasi, Amin}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {3311--3320}, year = {2019}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, volume = {97}, series = {Proceedings of Machine Learning Research}, month = {09--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v97/kazemi19a/kazemi19a.pdf}, url = {https://proceedings.mlr.press/v97/kazemi19a.html}, abstract = {Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose SIEVE-STREAMING++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $\frac{1}{2}$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $\frac{1}{4}$-approximation with $\Theta(k)$ memory or the optimal $\frac{1}{2}$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of SIEVE-STREAMING++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $\frac{1}{2}$-approximation guarantee with $O(k)$ shared memory, while minimizing not only the rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.} }
Endnote
%0 Conference Paper %T Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity %A Ehsan Kazemi %A Marko Mitrovic %A Morteza Zadimoghaddam %A Silvio Lattanzi %A Amin Karbasi %B Proceedings of the 36th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Ruslan Salakhutdinov %F pmlr-v97-kazemi19a %I PMLR %P 3311--3320 %U https://proceedings.mlr.press/v97/kazemi19a.html %V 97 %X Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose SIEVE-STREAMING++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $\frac{1}{2}$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $\frac{1}{4}$-approximation with $\Theta(k)$ memory or the optimal $\frac{1}{2}$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of SIEVE-STREAMING++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $\frac{1}{2}$-approximation guarantee with $O(k)$ shared memory, while minimizing not only the rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos.
APA
Kazemi, E., Mitrovic, M., Zadimoghaddam, M., Lattanzi, S. & Karbasi, A.. (2019). Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:3311-3320 Available from https://proceedings.mlr.press/v97/kazemi19a.html.

Related Material