[edit]
Hierarchical Clustering: A 0.585 Revenue Approximation
Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:153-162, 2020.
Abstract
Hierarchical Clustering trees have been widely accepted as a useful form of clustering data, resulting in a prevalence of adopting fields including phylogenetics, image analysis, bioinformatics and more. Recently, Dasgupta (STOC 16’) initiated the analysis of these types of algorithms through the lenses of approximation. Later, the dual problem was considered by Moseley and Wang (NIPS 17’) dubbing it the Revenue goal function. In this problem, given a nonnegative weight $w_{ij}$ for each pair $i,j \in [n]=\{1,2, \ldots ,n\}$, the objective is to find a tree $T$ whose set of leaves is $[n]$ that maximizes the function $\sum_{i