Learning Probabilistic Submodular Diversity Models Via Noise Contrastive Estimation
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:770-779, 2016.
Modeling diversity of sets of items is important in many applications such as product recommendation and data summarization. Probabilistic submodular models, a family of models including the determinantal point process, form a natural class of distributions, encouraging effects such as diversity, repulsion and coverage. Current models, however, are limited to small and medium number of items due to the high time complexity for learning and inference. In this paper, we propose FLID, a novel log-submodular diversity model that scales to large numbers of items and can be efficiently learned using noise contrastive estimation. We show that our model achieves state of the art performance in terms of model fit, but can be also learned orders of magnitude faster. We demonstrate the wide applicability of our model using several experiments.