Cost-Free Fairness in Online Correlation Clustering

Eric Balkanski, Jason Chatzitheodorou, Andreas Maggiori
Proceedings of The 36th International Conference on Algorithmic Learning Theory, PMLR 272:167-203, 2025.

Abstract

In the correlation clustering problem, the input is a signed graph where the sign indicates whether pairs of nodes should be placed in the same cluster or not. The goal is to create a clustering that minimizes the number of disagreements with these signs. Correlation clustering is a key unsupervised learning problem with many practical applications. It has been widely studied in various settings, including versions with fairness constraints and cases where nodes arrive online. In this paper, we explore a problem that combines these two settings: nodes arrive online and reveal their membership in protected groups upon arrival. We are only allowed to output fair clusters, i.e., clusters where the representation of each protected group is upper bounded by a user-specified constant at the beginning of the arrival sequence. Our aim is to maintain approximately optimal fair clustering while minimizing a node’s worst-case recourse, i.e., the number of times it changes clusters. We present an algorithm that achieves worst-case logarithmic recourse per node while maintaining a constant-factor fair approximate clustering. Additionally, our approach simplifies the algorithm and analysis used in prior work by Cohen-Addad et al. (2022) in the online setting with recourse.

Cite this Paper


BibTeX
@InProceedings{pmlr-v272-balkanski25b, title = {Cost-Free Fairness in Online Correlation Clustering}, author = {Balkanski, Eric and Chatzitheodorou, Jason and Maggiori, Andreas}, booktitle = {Proceedings of The 36th International Conference on Algorithmic Learning Theory}, pages = {167--203}, year = {2025}, editor = {Kamath, Gautam and Loh, Po-Ling}, volume = {272}, series = {Proceedings of Machine Learning Research}, month = {24--27 Feb}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v272/main/assets/balkanski25b/balkanski25b.pdf}, url = {https://proceedings.mlr.press/v272/balkanski25b.html}, abstract = {In the correlation clustering problem, the input is a signed graph where the sign indicates whether pairs of nodes should be placed in the same cluster or not. The goal is to create a clustering that minimizes the number of disagreements with these signs. Correlation clustering is a key unsupervised learning problem with many practical applications. It has been widely studied in various settings, including versions with fairness constraints and cases where nodes arrive online. In this paper, we explore a problem that combines these two settings: nodes arrive online and reveal their membership in protected groups upon arrival. We are only allowed to output fair clusters, i.e., clusters where the representation of each protected group is upper bounded by a user-specified constant at the beginning of the arrival sequence. Our aim is to maintain approximately optimal fair clustering while minimizing a node’s worst-case recourse, i.e., the number of times it changes clusters. We present an algorithm that achieves worst-case logarithmic recourse per node while maintaining a constant-factor fair approximate clustering. Additionally, our approach simplifies the algorithm and analysis used in prior work by Cohen-Addad et al. (2022) in the online setting with recourse.} }
Endnote
%0 Conference Paper %T Cost-Free Fairness in Online Correlation Clustering %A Eric Balkanski %A Jason Chatzitheodorou %A Andreas Maggiori %B Proceedings of The 36th International Conference on Algorithmic Learning Theory %C Proceedings of Machine Learning Research %D 2025 %E Gautam Kamath %E Po-Ling Loh %F pmlr-v272-balkanski25b %I PMLR %P 167--203 %U https://proceedings.mlr.press/v272/balkanski25b.html %V 272 %X In the correlation clustering problem, the input is a signed graph where the sign indicates whether pairs of nodes should be placed in the same cluster or not. The goal is to create a clustering that minimizes the number of disagreements with these signs. Correlation clustering is a key unsupervised learning problem with many practical applications. It has been widely studied in various settings, including versions with fairness constraints and cases where nodes arrive online. In this paper, we explore a problem that combines these two settings: nodes arrive online and reveal their membership in protected groups upon arrival. We are only allowed to output fair clusters, i.e., clusters where the representation of each protected group is upper bounded by a user-specified constant at the beginning of the arrival sequence. Our aim is to maintain approximately optimal fair clustering while minimizing a node’s worst-case recourse, i.e., the number of times it changes clusters. We present an algorithm that achieves worst-case logarithmic recourse per node while maintaining a constant-factor fair approximate clustering. Additionally, our approach simplifies the algorithm and analysis used in prior work by Cohen-Addad et al. (2022) in the online setting with recourse.
APA
Balkanski, E., Chatzitheodorou, J. & Maggiori, A.. (2025). Cost-Free Fairness in Online Correlation Clustering. Proceedings of The 36th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 272:167-203 Available from https://proceedings.mlr.press/v272/balkanski25b.html.

Related Material