Interactive Correlation Clustering with Existential Cluster Constraints

Rico Angell, Nicholas Monath, Nishant Yadav, Andrew Mccallum
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:703-716, 2022.

Abstract

We consider the problem of clustering with user feedback. Existing methods express constraints about the input data points, most commonly through must-link and cannot-link constraints on data point pairs. In this paper, we introduce existential cluster constraints: a new form of feedback where users indicate the features of desired clusters. Specifically, users make statements about the existence of a cluster having (and not having) particular features. Our approach has multiple advantages: (1) constraints on clusters can express user intent more efficiently than point pairs; (2) in cases where the users’ mental model is of the desired clusters, it is more natural for users to express cluster-wise preferences; (3) it functions even when privacy restrictions prohibit users from seeing raw data. In addition to introducing existential cluster constraints, we provide an inference algorithm for incorporating our constraints into the output clustering. Finally, we demonstrate empirically that our proposed framework facilitates more accurate clustering with dramatically fewer user feedback inputs.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-angell22a, title = {Interactive Correlation Clustering with Existential Cluster Constraints}, author = {Angell, Rico and Monath, Nicholas and Yadav, Nishant and Mccallum, Andrew}, booktitle = {Proceedings of the 39th International Conference on Machine Learning}, pages = {703--716}, year = {2022}, editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan}, volume = {162}, series = {Proceedings of Machine Learning Research}, month = {17--23 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v162/angell22a/angell22a.pdf}, url = {https://proceedings.mlr.press/v162/angell22a.html}, abstract = {We consider the problem of clustering with user feedback. Existing methods express constraints about the input data points, most commonly through must-link and cannot-link constraints on data point pairs. In this paper, we introduce existential cluster constraints: a new form of feedback where users indicate the features of desired clusters. Specifically, users make statements about the existence of a cluster having (and not having) particular features. Our approach has multiple advantages: (1) constraints on clusters can express user intent more efficiently than point pairs; (2) in cases where the users’ mental model is of the desired clusters, it is more natural for users to express cluster-wise preferences; (3) it functions even when privacy restrictions prohibit users from seeing raw data. In addition to introducing existential cluster constraints, we provide an inference algorithm for incorporating our constraints into the output clustering. Finally, we demonstrate empirically that our proposed framework facilitates more accurate clustering with dramatically fewer user feedback inputs.} }
Endnote
%0 Conference Paper %T Interactive Correlation Clustering with Existential Cluster Constraints %A Rico Angell %A Nicholas Monath %A Nishant Yadav %A Andrew Mccallum %B Proceedings of the 39th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2022 %E Kamalika Chaudhuri %E Stefanie Jegelka %E Le Song %E Csaba Szepesvari %E Gang Niu %E Sivan Sabato %F pmlr-v162-angell22a %I PMLR %P 703--716 %U https://proceedings.mlr.press/v162/angell22a.html %V 162 %X We consider the problem of clustering with user feedback. Existing methods express constraints about the input data points, most commonly through must-link and cannot-link constraints on data point pairs. In this paper, we introduce existential cluster constraints: a new form of feedback where users indicate the features of desired clusters. Specifically, users make statements about the existence of a cluster having (and not having) particular features. Our approach has multiple advantages: (1) constraints on clusters can express user intent more efficiently than point pairs; (2) in cases where the users’ mental model is of the desired clusters, it is more natural for users to express cluster-wise preferences; (3) it functions even when privacy restrictions prohibit users from seeing raw data. In addition to introducing existential cluster constraints, we provide an inference algorithm for incorporating our constraints into the output clustering. Finally, we demonstrate empirically that our proposed framework facilitates more accurate clustering with dramatically fewer user feedback inputs.
APA
Angell, R., Monath, N., Yadav, N. & Mccallum, A.. (2022). Interactive Correlation Clustering with Existential Cluster Constraints. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:703-716 Available from https://proceedings.mlr.press/v162/angell22a.html.

Related Material