[edit]
A Constant-Factor Approximation Algorithm for Reconciliation $k$-Median
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:1719-1746, 2023.
Abstract
In the reconciliation $k$-median problem we ask to cluster a set of data points by picking $k$ cluster centers so as to minimize the sum of distances of the data points to their cluster centers plus the sum of pairwise distances between the centers. The problem, which is a variant of classic $k$-median, aims to find a set of cluster centers that are not too far from each other, and it has applications, or example, when selecting a committee to deliberate on a controversial topic. This problem was introduced recently (Ordozgoiti et al., 2019), and it was shown that a local-search-based algorithm is always within a factor $O(k)$ of an optimum solution and performs well in practice. In this paper, we demonstrate a close connection of reconciliation $k$-median to a variant of the $k$-facility location problem, in which each potential cluster center has an individual opening cost and we aim at minimizing the sum of client-center distances and the opening costs. This connection enables us to provide a new algorithm for reconciliation $k$-median that yields a constant-factor approximation (independent of $k$). We also provide a sparsification scheme that reduces the number of potential cluster centers to $O(k)$ in order to substantially speed up approximation algorithms. We empirically compare our new algorithms with the previous local-search approach, showing improved performance and stability. In addition, we show how our sparsification approach helps to reduce computation time without significantly compromising the solution quality.