Interactive Bayesian Hierarchical Clustering


Sharad Vikram, Sanjoy Dasgupta
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2081-2090, 2016.


Clustering is a powerful tool in data analysis, but it is often difficult to find a grouping that aligns with a user’s needs. To address this, several methods incorporate user-provided constraints into clustering algorithms, but these methods do not apply to hierarchical clustering. We design an interactive Bayesian algorithm that incorporates user interaction into hierarchical clustering while still using the geometry of the data, by sampling from a constrained posterior distribution over hierarchies. We also suggest several ways to intelligently query a user. The algorithm, along with the querying schemes, shows promising results on real data.
