Maximizing Agreements for Ranking, Clustering and Hierarchical Clustering via MAX-CUT

Vaggos Chatziafratis, Mohammad Mahdian, Sara Ahmadian
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1657-1665, 2021.

Abstract

In this paper, we study a number of well-known combinatorial optimization problems that fit in the following paradigm: the input is a collection of (potentially inconsistent) local relationships between the elements of a ground set (e.g., pairwise comparisons, similar/dissimilar pairs, or ancestry structure of triples of points), and the goal is to aggregate this information into a global structure (e.g., a ranking, a clustering, or a hierarchical clustering) in a way that maximizes agreement with the input. Well-studied problems such as rank aggregation, correlation clustering, and hierarchical clustering with triplet constraints fall in this class of problems. We study these problems on stochastic instances with a hidden embedded ground truth solution. Our main algorithmic contribution is a unified technique that uses the maximum cut problem in graphs to approximately solve these problems. Using this technique, we can often get approximation guarantees in the stochastic setting that are better than the known worst case inapproximability bounds for the corresponding problem. On the negative side, we improve the worst case inapproximability bound on several hierarchical clustering formulations through a reduction to related ranking problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-chatziafratis21a, title = { Maximizing Agreements for Ranking, Clustering and Hierarchical Clustering via MAX-CUT }, author = {Chatziafratis, Vaggos and Mahdian, Mohammad and Ahmadian, Sara}, booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics}, pages = {1657--1665}, year = {2021}, editor = {Banerjee, Arindam and Fukumizu, Kenji}, volume = {130}, series = {Proceedings of Machine Learning Research}, month = {13--15 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v130/chatziafratis21a/chatziafratis21a.pdf}, url = {https://proceedings.mlr.press/v130/chatziafratis21a.html}, abstract = { In this paper, we study a number of well-known combinatorial optimization problems that fit in the following paradigm: the input is a collection of (potentially inconsistent) local relationships between the elements of a ground set (e.g., pairwise comparisons, similar/dissimilar pairs, or ancestry structure of triples of points), and the goal is to aggregate this information into a global structure (e.g., a ranking, a clustering, or a hierarchical clustering) in a way that maximizes agreement with the input. Well-studied problems such as rank aggregation, correlation clustering, and hierarchical clustering with triplet constraints fall in this class of problems. We study these problems on stochastic instances with a hidden embedded ground truth solution. Our main algorithmic contribution is a unified technique that uses the maximum cut problem in graphs to approximately solve these problems. Using this technique, we can often get approximation guarantees in the stochastic setting that are better than the known worst case inapproximability bounds for the corresponding problem. On the negative side, we improve the worst case inapproximability bound on several hierarchical clustering formulations through a reduction to related ranking problems. } }
Endnote
%0 Conference Paper %T Maximizing Agreements for Ranking, Clustering and Hierarchical Clustering via MAX-CUT %A Vaggos Chatziafratis %A Mohammad Mahdian %A Sara Ahmadian %B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2021 %E Arindam Banerjee %E Kenji Fukumizu %F pmlr-v130-chatziafratis21a %I PMLR %P 1657--1665 %U https://proceedings.mlr.press/v130/chatziafratis21a.html %V 130 %X In this paper, we study a number of well-known combinatorial optimization problems that fit in the following paradigm: the input is a collection of (potentially inconsistent) local relationships between the elements of a ground set (e.g., pairwise comparisons, similar/dissimilar pairs, or ancestry structure of triples of points), and the goal is to aggregate this information into a global structure (e.g., a ranking, a clustering, or a hierarchical clustering) in a way that maximizes agreement with the input. Well-studied problems such as rank aggregation, correlation clustering, and hierarchical clustering with triplet constraints fall in this class of problems. We study these problems on stochastic instances with a hidden embedded ground truth solution. Our main algorithmic contribution is a unified technique that uses the maximum cut problem in graphs to approximately solve these problems. Using this technique, we can often get approximation guarantees in the stochastic setting that are better than the known worst case inapproximability bounds for the corresponding problem. On the negative side, we improve the worst case inapproximability bound on several hierarchical clustering formulations through a reduction to related ranking problems.
APA
Chatziafratis, V., Mahdian, M. & Ahmadian, S.. (2021). Maximizing Agreements for Ranking, Clustering and Hierarchical Clustering via MAX-CUT . Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:1657-1665 Available from https://proceedings.mlr.press/v130/chatziafratis21a.html.

Related Material