Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”

Baharan Mirzasoleiman, Amin Karbasi, Andreas Krause
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2449-2458, 2017.

Abstract

How can we summarize a dynamic data stream when elements selected for the summary can be deleted at any time? This is an important challenge in online services, where the users generating the data may decide to exercise their right to restrict the service provider from using (part of) their data due to privacy concerns. Motivated by this challenge, we introduce the dynamic deletion-robust submodular maximization problem. We develop the first resilient streaming algorithm, called ROBUST-STREAMING, with a constant factor approximation guarantee to the optimum solution. We evaluate the effectiveness of our approach on several real-world applications, including summarizing (1) streams of geo-coordinates; (2) streams of images; and (3) click-stream log data, consisting of 45 million feature vectors from a news recommendation task.

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-mirzasoleiman17a,
  title = {Deletion-Robust Submodular Maximization: Data Summarization with ``the Right to be Forgotten''},
  author = {Baharan Mirzasoleiman and Amin Karbasi and Andreas Krause},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {2449--2458},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v70/mirzasoleiman17a/mirzasoleiman17a.pdf},
  url = {https://proceedings.mlr.press/v70/mirzasoleiman17a.html},
  abstract = {How can we summarize a dynamic data stream when elements selected for the summary can be deleted at any time? This is an important challenge in online services, where the users generating the data may decide to exercise their right to restrict the service provider from using (part of) their data due to privacy concerns. Motivated by this challenge, we introduce the dynamic deletion-robust submodular maximization problem. We develop the first resilient streaming algorithm, called ROBUST-STREAMING, with a constant factor approximation guarantee to the optimum solution. We evaluate the effectiveness of our approach on several real-world applications, including summarizing (1) streams of geo-coordinates; (2) streams of images; and (3) click-stream log data, consisting of 45 million feature vectors from a news recommendation task.}
}
Endnote
%0 Conference Paper
%T Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”
%A Baharan Mirzasoleiman
%A Amin Karbasi
%A Andreas Krause
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-mirzasoleiman17a
%I PMLR
%P 2449--2458
%U https://proceedings.mlr.press/v70/mirzasoleiman17a.html
%V 70
%X How can we summarize a dynamic data stream when elements selected for the summary can be deleted at any time? This is an important challenge in online services, where the users generating the data may decide to exercise their right to restrict the service provider from using (part of) their data due to privacy concerns. Motivated by this challenge, we introduce the dynamic deletion-robust submodular maximization problem. We develop the first resilient streaming algorithm, called ROBUST-STREAMING, with a constant factor approximation guarantee to the optimum solution. We evaluate the effectiveness of our approach on several real-world applications, including summarizing (1) streams of geo-coordinates; (2) streams of images; and (3) click-stream log data, consisting of 45 million feature vectors from a news recommendation task.
APA
Mirzasoleiman, B., Karbasi, A., & Krause, A. (2017). Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:2449-2458. Available from https://proceedings.mlr.press/v70/mirzasoleiman17a.html.

Related Material