Correlation Clustering with Asymmetric Classification Errors

Jafar Jafarov, Sanchit Kalhan, Konstantin Makarychev, Yury Makarychev
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4641-4650, 2020.

Abstract

In the Correlation Clustering problem, we are given a weighted graph $G$ with its edges labelled as "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of "similar" edges across clusters and "dissimilar" edges within clusters. We study the correlation clustering problem under the following assumption: Every "similar" edge $e$ has weight $w_e \in [ \alpha w, w ]$ and every "dissimilar" edge $e$ has weight $w_e \geq \alpha w$ (where $\alpha \leq 1$ and $w > 0$ is a scaling parameter). We give a $(3 + 2 \log_e (1/\alpha))$ approximation algorithm for this problem. This assumption captures well the scenario when classification errors are asymmetric. Additionally, we show an asymptotically matching Linear Programming integrality gap of $\Omega(\log 1/\alpha)$.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-jafarov20a, title = {Correlation Clustering with Asymmetric Classification Errors}, author = {Jafarov, Jafar and Kalhan, Sanchit and Makarychev, Konstantin and Makarychev, Yury}, booktitle = {Proceedings of the 37th International Conference on Machine Learning}, pages = {4641--4650}, year = {2020}, editor = {III, Hal Daumé and Singh, Aarti}, volume = {119}, series = {Proceedings of Machine Learning Research}, month = {13--18 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v119/jafarov20a/jafarov20a.pdf}, url = {http://proceedings.mlr.press/v119/jafarov20a.html}, abstract = {In the Correlation Clustering problem, we are given a weighted graph $G$ with its edges labelled as "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of "similar" edges across clusters and "dissimilar" edges within clusters. We study the correlation clustering problem under the following assumption: Every "similar" edge $e$ has weight $w_e \in [ \alpha w, w ]$ and every "dissimilar" edge $e$ has weight $w_e \geq \alpha w$ (where $\alpha \leq 1$ and $w > 0$ is a scaling parameter). We give a $(3 + 2 \log_e (1/\alpha))$ approximation algorithm for this problem. This assumption captures well the scenario when classification errors are asymmetric. Additionally, we show an asymptotically matching Linear Programming integrality gap of $\Omega(\log 1/\alpha)$.} }
Endnote
%0 Conference Paper %T Correlation Clustering with Asymmetric Classification Errors %A Jafar Jafarov %A Sanchit Kalhan %A Konstantin Makarychev %A Yury Makarychev %B Proceedings of the 37th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2020 %E Hal Daumé III %E Aarti Singh %F pmlr-v119-jafarov20a %I PMLR %P 4641--4650 %U http://proceedings.mlr.press/v119/jafarov20a.html %V 119 %X In the Correlation Clustering problem, we are given a weighted graph $G$ with its edges labelled as "similar" or "dissimilar" by a binary classifier. The goal is to produce a clustering that minimizes the weight of "disagreements": the sum of the weights of "similar" edges across clusters and "dissimilar" edges within clusters. We study the correlation clustering problem under the following assumption: Every "similar" edge $e$ has weight $w_e \in [ \alpha w, w ]$ and every "dissimilar" edge $e$ has weight $w_e \geq \alpha w$ (where $\alpha \leq 1$ and $w > 0$ is a scaling parameter). We give a $(3 + 2 \log_e (1/\alpha))$ approximation algorithm for this problem. This assumption captures well the scenario when classification errors are asymmetric. Additionally, we show an asymptotically matching Linear Programming integrality gap of $\Omega(\log 1/\alpha)$.
APA
Jafarov, J., Kalhan, S., Makarychev, K. & Makarychev, Y.. (2020). Correlation Clustering with Asymmetric Classification Errors. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:4641-4650 Available from http://proceedings.mlr.press/v119/jafarov20a.html.

Related Material