[edit]

# Fast Dynamic Sampling for Determinantal Point Processes

*Proceedings of The 27th International Conference on Artificial Intelligence and Statistics*, PMLR 238:244-252, 2024.

#### Abstract

n this work, we provide fast dynamic algorithms for repeatedly sampling from distributions characterized by Determinantal Point Processes (DPPs) and Nonsymmetric Determinantal Point Processes (NDPPs). DPPs are a very well-studied class of distributions on subsets of items drawn from a ground set of cardinality $n$ characterized by a symmetric $n \times n$ kernel matrix $L$ such that the probability of any subset is proportional to the determinant of its corresponding principal submatrix. Recent work has shown that the kernel symmetry constraint can be relaxed, leading to NDPPs, which can better model data in several machine learning applications. Given a low-rank kernel matrix ${\cal L}=L+L^\top\in \mathbb{R}^{n\times n}$ and its corresponding eigendecomposition specified by $\{\lambda_i, u_i \}_{i=1}^d$ where $d\leq n$ is the rank, we design a data structure that uses $O(nd)$ space and preprocesses data in $O(nd^{\omega-1})$ time where $\omega\approx 2.37$ is the exponent of matrix multiplication. The data structure can generate a sample according to DPP distribution in time $O(|E|^3\log n+|E|^{\omega-1}d^2)$ or according to NDPP distribution in time $O((|E|^3 \log n+ |E|^{\omega-1}d^2)(1+w)^d)$ for $E$ being the sampled indices and $w$ is a data-dependent parameter. This improves upon the space and preprocessing time over prior works, and achieves a state-of-the-art sampling time when the sampling set is relatively dense. At the heart of our data structure is an efficient sampling tree that can leverage batch initialization and fast inner product query simultaneously.