Supervised Hierarchical Clustering with Exponential Linkage

Nishant Yadav, Ari Kobren, Nicholas Monath, Andrew Mccallum
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6973-6983, 2019.

Abstract

In supervised clustering, standard techniques for learning a pairwise dissimilarity function often suffer from a discrepancy between the training and clustering objectives, leading to poor cluster quality. Rectifying this discrepancy necessitates matching the procedure for training the dissimilarity function to the clustering algorithm. In this paper, we introduce a method for training the dissimilarity function in a way that is tightly coupled with hierarchical clustering, in particular single linkage. However, the appropriate clustering algorithm for a given dataset is often unknown. Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function. We accomplish this with a novel Exponential Linkage function that has a learnable parameter that controls the interpolation. In experiments on four datasets, our joint training procedure consistently matches or outperforms the next best training procedure/linkage function pair and gives up to 8 points improvement in dendrogram purity over discrepant pairs.
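As a rough illustration of the interpolation the abstract describes, the sketch below assumes the linkage is a softmax-weighted average of the pairwise dissimilarities between two clusters, with a scalar `alpha` standing in for the learnable parameter. The abstract does not state the formula, so this form is an assumption; the function name, the parameter name, and the sample values are hypothetical.

```python
import math


def exp_linkage(dists, alpha):
    """Softmax-weighted average of pairwise dissimilarities (assumed form).

    dists: dissimilarities between every cross-cluster pair of points.
    alpha: interpolation parameter; 0 gives the plain average, large
           negative values approach the minimum (single linkage), and
           large positive values approach the maximum (complete linkage).
    """
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(alpha * d for d in dists)
    weights = [math.exp(alpha * d - shift) for d in dists]
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, dists)) / total


# Hypothetical dissimilarities between the points of two clusters.
pairwise = [1.0, 2.0, 4.0]

avg = exp_linkage(pairwise, 0.0)        # behaves like average linkage
near_min = exp_linkage(pairwise, -50.0)  # behaves like single linkage
near_max = exp_linkage(pairwise, 50.0)   # behaves like complete linkage
```

In a joint training setup of the kind the abstract mentions, `alpha` would be optimized together with the dissimilarity function instead of being fixed by hand as it is here.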

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-yadav19a,
  title = {Supervised Hierarchical Clustering with Exponential Linkage},
  author = {Yadav, Nishant and Kobren, Ari and Monath, Nicholas and Mccallum, Andrew},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages = {6973--6983},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/yadav19a/yadav19a.pdf},
  url = {https://proceedings.mlr.press/v97/yadav19a.html},
  abstract = {In supervised clustering, standard techniques for learning a pairwise dissimilarity function often suffer from a discrepancy between the training and clustering objectives, leading to poor cluster quality. Rectifying this discrepancy necessitates matching the procedure for training the dissimilarity function to the clustering algorithm. In this paper, we introduce a method for training the dissimilarity function in a way that is tightly coupled with hierarchical clustering, in particular single linkage. However, the appropriate clustering algorithm for a given dataset is often unknown. Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function. We accomplish this with a novel Exponential Linkage function that has a learnable parameter that controls the interpolation. In experiments on four datasets, our joint training procedure consistently matches or outperforms the next best training procedure/linkage function pair and gives up to 8 points improvement in dendrogram purity over discrepant pairs.}
}
Endnote
%0 Conference Paper
%T Supervised Hierarchical Clustering with Exponential Linkage
%A Nishant Yadav
%A Ari Kobren
%A Nicholas Monath
%A Andrew Mccallum
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-yadav19a
%I PMLR
%P 6973--6983
%U https://proceedings.mlr.press/v97/yadav19a.html
%V 97
%X In supervised clustering, standard techniques for learning a pairwise dissimilarity function often suffer from a discrepancy between the training and clustering objectives, leading to poor cluster quality. Rectifying this discrepancy necessitates matching the procedure for training the dissimilarity function to the clustering algorithm. In this paper, we introduce a method for training the dissimilarity function in a way that is tightly coupled with hierarchical clustering, in particular single linkage. However, the appropriate clustering algorithm for a given dataset is often unknown. Thus we introduce an approach to supervised hierarchical clustering that smoothly interpolates between single, average, and complete linkage, and we give a training procedure that simultaneously learns a linkage function and a dissimilarity function. We accomplish this with a novel Exponential Linkage function that has a learnable parameter that controls the interpolation. In experiments on four datasets, our joint training procedure consistently matches or outperforms the next best training procedure/linkage function pair and gives up to 8 points improvement in dendrogram purity over discrepant pairs.
APA
Yadav, N., Kobren, A., Monath, N. & Mccallum, A. (2019). Supervised Hierarchical Clustering with Exponential Linkage. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:6973-6983. Available from https://proceedings.mlr.press/v97/yadav19a.html.
