Fast Constrained Submodular Maximization: Personalized Data Summarization

Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1358-1367, 2016.

Abstract

Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first practical and FAst coNsTrained submOdular Maximization algorithm, FANTOM, with strong theoretical guarantees. FANTOM maximizes a submodular function (not necessarily monotone) subject to the intersection of a p-system and l knapsack constraints. It achieves a (1 + ε)(p + 1)(2p + 2l + 1)/p approximation guarantee with only O(nrp log(n)/ε) query complexity (n and r indicate the size of the ground set and the size of the largest feasible solution, respectively). We then show how we can use FANTOM for personalized data summarization. In particular, a p-system can model different aspects of data, such as categories or time stamps, from which the users choose. In addition, knapsacks encode users’ constraints including budget or time. In our set of experiments, we consider several concrete applications: movie recommendation over 11K movies, personalized image summarization with 10K images, and revenue maximization on the YouTube social network with 5000 communities. We observe that FANTOM consistently provides the highest utility against all the baselines.
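To make the guarantee stated in the abstract concrete, here is a minimal Python sketch that evaluates the factor (1 + ε)(p + 1)(2p + 2l + 1)/p and the query-count expression nrp log(n)/ε for given parameters. The function names `fantom_approx_factor` and `fantom_query_estimate` are hypothetical and not taken from the paper, the constant hidden by the O(·) notation is assumed to be 1, and the natural logarithm is assumed. Reading the factor α as "the returned solution has value at least OPT/α" is also an assumption about how the guarantee is meant.

```python
import math


def fantom_approx_factor(p: int, l: int, eps: float) -> float:
    """Evaluate (1 + eps)(p + 1)(2p + 2l + 1)/p as written in the abstract.

    p   : parameter of the p-system constraint (assumed p >= 1)
    l   : number of knapsack constraints (assumed l >= 0)
    eps : accuracy parameter epsilon (assumed eps > 0)
    """
    if p < 1 or l < 0 or eps <= 0:
        raise ValueError("expected p >= 1, l >= 0, eps > 0")
    return (1 + eps) * (p + 1) * (2 * p + 2 * l + 1) / p


def fantom_query_estimate(n: int, r: int, p: int, eps: float) -> float:
    """Evaluate n * r * p * log(n) / eps.

    Assumption: the constant hidden by O(.) is taken to be 1 and the
    logarithm is natural, so this is only an order-of-magnitude figure.
    """
    if n < 1 or r < 1 or p < 1 or eps <= 0:
        raise ValueError("expected n >= 1, r >= 1, p >= 1, eps > 0")
    return n * r * p * math.log(n) / eps


if __name__ == "__main__":
    # Illustrative values only: one matroid-like constraint (p = 1),
    # one knapsack (l = 1), eps = 0.1.
    print(fantom_approx_factor(p=1, l=1, eps=0.1))
    print(fantom_query_estimate(n=10_000, r=20, p=1, eps=0.1))
```

For fixed ε the factor grows linearly in both p and l, while the query expression does not depend on l at all.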

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-mirzasoleiman16,
  title = {Fast Constrained Submodular Maximization: Personalized Data Summarization},
  author = {Mirzasoleiman, Baharan and Badanidiyuru, Ashwinkumar and Karbasi, Amin},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages = {1358--1367},
  year = {2016},
  editor = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = {48},
  series = {Proceedings of Machine Learning Research},
  address = {New York, New York, USA},
  month = {20--22 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v48/mirzasoleiman16.pdf},
  url = {http://proceedings.mlr.press/v48/mirzasoleiman16.html},
  abstract = {Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first practical and FAst coNsTrained submOdular Maximization algorithm, FANTOM, with strong theoretical guarantees. FANTOM maximizes a submodular function (not necessarily monotone) subject to the intersection of a p-system and l knapsack constraints. It achieves a (1 + ε)(p + 1)(2p + 2l + 1)/p approximation guarantee with only O(nrp log(n)/ε) query complexity (n and r indicate the size of the ground set and the size of the largest feasible solution, respectively). We then show how we can use FANTOM for personalized data summarization. In particular, a p-system can model different aspects of data, such as categories or time stamps, from which the users choose. In addition, knapsacks encode users’ constraints including budget or time. In our set of experiments, we consider several concrete applications: movie recommendation over 11K movies, personalized image summarization with 10K images, and revenue maximization on the YouTube social network with 5000 communities. We observe that FANTOM consistently provides the highest utility against all the baselines.}
}
Endnote
%0 Conference Paper
%T Fast Constrained Submodular Maximization: Personalized Data Summarization
%A Baharan Mirzasoleiman
%A Ashwinkumar Badanidiyuru
%A Amin Karbasi
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-mirzasoleiman16
%I PMLR
%P 1358--1367
%U http://proceedings.mlr.press/v48/mirzasoleiman16.html
%V 48
%X Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first practical and FAst coNsTrained submOdular Maximization algorithm, FANTOM, with strong theoretical guarantees. FANTOM maximizes a submodular function (not necessarily monotone) subject to the intersection of a p-system and l knapsack constraints. It achieves a (1 + ε)(p + 1)(2p + 2l + 1)/p approximation guarantee with only O(nrp log(n)/ε) query complexity (n and r indicate the size of the ground set and the size of the largest feasible solution, respectively). We then show how we can use FANTOM for personalized data summarization. In particular, a p-system can model different aspects of data, such as categories or time stamps, from which the users choose. In addition, knapsacks encode users’ constraints including budget or time. In our set of experiments, we consider several concrete applications: movie recommendation over 11K movies, personalized image summarization with 10K images, and revenue maximization on the YouTube social network with 5000 communities. We observe that FANTOM consistently provides the highest utility against all the baselines.
RIS
TY  - CPAPER
TI  - Fast Constrained Submodular Maximization: Personalized Data Summarization
AU  - Baharan Mirzasoleiman
AU  - Ashwinkumar Badanidiyuru
AU  - Amin Karbasi
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-mirzasoleiman16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1358
EP  - 1367
L1  - http://proceedings.mlr.press/v48/mirzasoleiman16.pdf
UR  - http://proceedings.mlr.press/v48/mirzasoleiman16.html
AB  - Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first practical and FAst coNsTrained submOdular Maximization algorithm, FANTOM, with strong theoretical guarantees. FANTOM maximizes a submodular function (not necessarily monotone) subject to the intersection of a p-system and l knapsack constraints. It achieves a (1 + ε)(p + 1)(2p + 2l + 1)/p approximation guarantee with only O(nrp log(n)/ε) query complexity (n and r indicate the size of the ground set and the size of the largest feasible solution, respectively). We then show how we can use FANTOM for personalized data summarization. In particular, a p-system can model different aspects of data, such as categories or time stamps, from which the users choose. In addition, knapsacks encode users’ constraints including budget or time. In our set of experiments, we consider several concrete applications: movie recommendation over 11K movies, personalized image summarization with 10K images, and revenue maximization on the YouTube social network with 5000 communities. We observe that FANTOM consistently provides the highest utility against all the baselines.
ER  -
APA
Mirzasoleiman, B., Badanidiyuru, A. & Karbasi, A. (2016). Fast Constrained Submodular Maximization: Personalized Data Summarization. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1358-1367. Available from http://proceedings.mlr.press/v48/mirzasoleiman16.html

Related Material