Improved Approximation for Fair Correlation Clustering

Sara Ahmadian, Maryam Negahbani
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:9499-9516, 2023.

Abstract

Correlation clustering is a ubiquitous paradigm in unsupervised machine learning where addressing unfairness is a major challenge. Motivated by this, we study fair correlation clustering where the data points may belong to different protected groups and the goal is to ensure fair representation of all groups across clusters. Our paper significantly generalizes and improves on the quality guarantees of previous work of Ahmadian et al. as follows. * We allow the user to specify an arbitrary upper bound on the representation of each group in a cluster. * Our algorithm allows individuals to have multiple protected features and ensure fairness simultaneously across them all. * We prove guarantees for clustering quality and fairness in this general setting. Furthermore, this improves on the results for the special cases studied in previous work. Our experiments on real-world data demonstrate that our clustering quality compared to the optimal solution is much better than what our theoretical result suggests.

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-ahmadian23a, title = {Improved Approximation for Fair Correlation Clustering}, author = {Ahmadian, Sara and Negahbani, Maryam}, booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics}, pages = {9499--9516}, year = {2023}, editor = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem}, volume = {206}, series = {Proceedings of Machine Learning Research}, month = {25--27 Apr}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v206/ahmadian23a/ahmadian23a.pdf}, url = {https://proceedings.mlr.press/v206/ahmadian23a.html}, abstract = {Correlation clustering is a ubiquitous paradigm in unsupervised machine learning where addressing unfairness is a major challenge. Motivated by this, we study fair correlation clustering where the data points may belong to different protected groups and the goal is to ensure fair representation of all groups across clusters. Our paper significantly generalizes and improves on the quality guarantees of previous work of Ahmadian et al. as follows. * We allow the user to specify an arbitrary upper bound on the representation of each group in a cluster. * Our algorithm allows individuals to have multiple protected features and ensure fairness simultaneously across them all. * We prove guarantees for clustering quality and fairness in this general setting. Furthermore, this improves on the results for the special cases studied in previous work. Our experiments on real-world data demonstrate that our clustering quality compared to the optimal solution is much better than what our theoretical result suggests.} }
Endnote
%0 Conference Paper %T Improved Approximation for Fair Correlation Clustering %A Sara Ahmadian %A Maryam Negahbani %B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2023 %E Francisco Ruiz %E Jennifer Dy %E Jan-Willem van de Meent %F pmlr-v206-ahmadian23a %I PMLR %P 9499--9516 %U https://proceedings.mlr.press/v206/ahmadian23a.html %V 206 %X Correlation clustering is a ubiquitous paradigm in unsupervised machine learning where addressing unfairness is a major challenge. Motivated by this, we study fair correlation clustering where the data points may belong to different protected groups and the goal is to ensure fair representation of all groups across clusters. Our paper significantly generalizes and improves on the quality guarantees of previous work of Ahmadian et al. as follows. * We allow the user to specify an arbitrary upper bound on the representation of each group in a cluster. * Our algorithm allows individuals to have multiple protected features and ensure fairness simultaneously across them all. * We prove guarantees for clustering quality and fairness in this general setting. Furthermore, this improves on the results for the special cases studied in previous work. Our experiments on real-world data demonstrate that our clustering quality compared to the optimal solution is much better than what our theoretical result suggests.
APA
Ahmadian, S. & Negahbani, M.. (2023). Improved Approximation for Fair Correlation Clustering. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:9499-9516 Available from https://proceedings.mlr.press/v206/ahmadian23a.html.

Related Material