Refined Lower Bounds for Nearest Neighbor Condensation
Proceedings of The 33rd International Conference on Algorithmic Learning Theory, PMLR 167:262-281, 2022.
Abstract
One of the most commonly used classification techniques is the nearest neighbor rule: given a training set $T$ of labeled points in a metric space $(\mathcal{X},\rho)$, a new unlabeled point $x\in \mathcal{X}$ is assigned the label of its nearest neighbor in $T$. To improve both the space and time complexity of this classification, it is desirable to reduce the size of the training set without compromising too much on the accuracy of the classification. Hart (1968) formalized this as the \textsc{Nearest Neighbor Condensation} (NNC) problem: find a subset $C\subseteq T$ of minimum size which is \emph{consistent} with $T$, i.e., such that each point $t\in T$ has the same label as its nearest neighbor in $C$ (see the sketch below). This problem is known to be NP-hard (Wilfong, 1991), and the heuristics used in practice often have weak or no theoretical guarantees. We analyze this problem through the \emph{refined} lens of parameterized complexity, and obtain strong lower bounds for the $k$-\textsc{NNC}-$(\mathbb{Z}^{d},\ell_p)$ problem, which asks whether there is a consistent subset of size $\leq k$ for a given training set of size $n$ in the metric space $(\mathbb{Z}^d,\ell_p)$, for any $1\leq p\leq \infty$:

- The $k$-\textsc{NNC}-$(\mathbb{Z}^{d},\ell_p)$ problem is W[1]-hard parameterized by $k+d$, i.e., unless FPT = W[1], there is no $f(k,d)\cdot n^{O(1)}$ time algorithm for any computable function $f$.
- Under the Exponential Time Hypothesis (ETH), for any $d\geq 2$ and any computable function $f$, the $k$-\textsc{NNC}-$(\mathbb{Z}^{d},\ell_p)$ problem cannot be solved in $f(k,d)\cdot n^{o(k^{1-1/d})}$ time.

The second lower bound exhibits a so-called "limited blessing of low-dimensionality" (Marx and Sidiropoulos, 2014): for small $d$ some improvement \emph{might be} possible over the brute-force $n^{O(k)}$ time algorithm, but as $d$ becomes large the brute-force algorithm becomes asymptotically optimal. It also shows that the $n^{O(\sqrt{k})}$ time algorithm of Biniaz et al. (2019) for $k$-\textsc{NNC}-$(\mathbb{R}^{2},\ell_2)$ is asymptotically tight. Our lower bounds on the fine-grained complexity of \textsc{NNC} in a sense justify the use of heuristics in practice, even though they have weak or no theoretical guarantees.
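For concreteness, here is a minimal Python sketch of the nearest neighbor rule, the consistency condition, and Hart's (1968) condensation heuristic. The function names (`dist`, `nn_label`, `is_consistent`, `hart_cnn`), the tuple encoding of points, and the toy data are our own illustrative choices, not notation from the paper; the heuristic returns \emph{some} consistent subset, with no guarantee of minimum size.

```python
import math

def dist(a, b, p=2.0):
    """l_p distance between points given as coordinate tuples."""
    if p == math.inf:
        return max(abs(x - y) for x, y in zip(a, b))
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def nn_label(x, train, p=2.0):
    """Nearest neighbor rule: label x by its closest point in train,
    where train is a list of (point, label) pairs."""
    _, label = min(train, key=lambda tl: dist(x, tl[0], p))
    return label

def is_consistent(subset, train, p=2.0):
    """C is consistent with T iff every t in T keeps its own label
    under the nearest neighbor rule applied to C."""
    return all(nn_label(t, subset, p) == label for t, label in train)

def hart_cnn(train, p=2.0):
    """Hart's condensation heuristic (a sketch, not the paper's focus):
    repeatedly add any point of T misclassified by the current subset,
    until the subset is consistent. Not necessarily minimum size."""
    subset = [train[0]]  # seed with an arbitrary point
    changed = True
    while changed:
        changed = False
        for t, label in train:
            if nn_label(t, subset, p) != label:
                subset.append((t, label))
                changed = True
    return subset

# Toy example in (Z^2, l_2): two well-separated clusters condense
# to one representative point each.
T = [((0, 0), 'a'), ((1, 0), 'a'), ((0, 1), 'a'),
     ((10, 10), 'b'), ((11, 10), 'b'), ((10, 11), 'b')]
C = hart_cnn(T)
assert is_consistent(C, T)
print(len(C), "of", len(T), "points kept")
```

On the toy data the heuristic keeps two of the six points; the paper's lower bounds concern the much harder worst-case problem of finding a consistent subset of minimum size.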
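As a side note on the bounds quoted above (our arithmetic, not an argument from the paper): the brute-force algorithm simply enumerates every subset of size at most $k$ and checks consistency, giving
\[
\underbrace{\textstyle\sum_{i\le k}\binom{n}{i}}_{\text{candidate subsets}} \cdot \underbrace{O(n\cdot k\cdot d)}_{\text{consistency check}} \;=\; n^{O(k)},
\]
while for $d=2$ the ETH lower bound specializes to $f(k,2)\cdot n^{o(k^{1-1/2})} = f(k,2)\cdot n^{o(\sqrt{k})}$, which is exactly the running time that the $n^{O(\sqrt{k})}$ algorithm of Biniaz et al. (2019) matches.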