Fair Soft Clustering

Rune D. Kjærsgaard, Pekka Parviainen, Saket Saurabh, Madhumita Kundu, Line Clemmensen
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1270-1278, 2024.

Abstract

Scholars in the machine learning community have recently focused on analyzing the fairness of learning models, including clustering algorithms. In this work, we study fair clustering in a probabilistic (soft) setting, where observations may belong to several clusters determined by probabilities. We introduce new probabilistic fairness metrics, which generalize and extend existing non-probabilistic fairness frameworks, and propose an algorithm for obtaining a fair probabilistic cluster solution from a data representation known as a fairlet decomposition. Finally, we demonstrate our proposed fairness metrics and algorithm by constructing a fair Gaussian mixture model on three real-world datasets. We achieve this by identifying balanced micro-clusters which minimize the distances induced by the model, and on which traditional clustering can be performed while ensuring the fairness of the solution.
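
The abstract does not spell out the probabilistic fairness metrics. As a rough illustration only, one natural way to carry the classical hard-clustering balance measure (the smallest ratio between the two protected groups within any cluster) over to a soft setting is to replace group counts with summed membership probabilities. The sketch below assumes a binary protected attribute; the function name `soft_balance` and this particular definition are assumptions made for illustration and may differ from the metrics defined in the paper.

```python
def soft_balance(resp, groups):
    """Illustrative soft analogue of cluster balance (an assumption, not the paper's metric).

    resp   : list of rows, resp[i][j] = probability that point i belongs to cluster j
    groups : list of 0/1 protected-group labels, one per point
    Returns the minimum over clusters of min(m0/m1, m1/m0), where m0 and m1 are
    the expected group masses in that cluster. 1.0 means perfectly balanced.
    """
    k = len(resp[0])
    mass = {0: [0.0] * k, 1: [0.0] * k}
    # Accumulate the expected number of points of each group in each cluster.
    for row, g in zip(resp, groups):
        for j, p in enumerate(row):
            mass[g][j] += p
    ratios = []
    for j in range(k):
        lo = min(mass[0][j], mass[1][j])
        hi = max(mass[0][j], mass[1][j])
        # A cluster with no mass from either group is treated as unbalanced.
        ratios.append(lo / hi if hi > 0 else 0.0)
    return min(ratios)


# With hard (0/1) memberships this reduces to the usual count-based balance.
hard = [[1, 0], [1, 0], [0, 1], [0, 1]]
print(soft_balance(hard, [0, 1, 0, 1]))

# A soft assignment in which the second cluster is dominated by group 1.
soft = [[0.9, 0.1], [0.5, 0.5]]
print(soft_balance(soft, [0, 1]))
```

With 0/1 memberships the measure coincides with the count-based balance, which is the sense in which a soft metric of this kind generalizes the hard one.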

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-kjaersgaard24a,
  title = {Fair Soft Clustering},
  author = {Kj{\ae}rsgaard, Rune D. and Parviainen, Pekka and Saurabh, Saket and Kundu, Madhumita and Clemmensen, Line},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {1270--1278},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/kjaersgaard24a/kjaersgaard24a.pdf},
  url = {https://proceedings.mlr.press/v238/kjaersgaard24a.html},
  abstract = {Scholars in the machine learning community have recently focused on analyzing the fairness of learning models, including clustering algorithms. In this work we study fair clustering in a probabilistic (soft) setting, where observations may belong to several clusters determined by probabilities. We introduce new probabilistic fairness metrics, which generalize and extend existing non-probabilistic fairness frameworks and propose an algorithm for obtaining a fair probabilistic cluster solution from a data representation known as a fairlet decomposition. Finally, we demonstrate our proposed fairness metrics and algorithm by constructing a fair Gaussian mixture model on three real-world datasets. We achieve this by identifying balanced micro-clusters which minimize the distances induced by the model, and on which traditional clustering can be performed while ensuring the fairness of the solution.}
}
Endnote
%0 Conference Paper
%T Fair Soft Clustering
%A Rune D. Kjærsgaard
%A Pekka Parviainen
%A Saket Saurabh
%A Madhumita Kundu
%A Line Clemmensen
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-kjaersgaard24a
%I PMLR
%P 1270--1278
%U https://proceedings.mlr.press/v238/kjaersgaard24a.html
%V 238
%X Scholars in the machine learning community have recently focused on analyzing the fairness of learning models, including clustering algorithms. In this work we study fair clustering in a probabilistic (soft) setting, where observations may belong to several clusters determined by probabilities. We introduce new probabilistic fairness metrics, which generalize and extend existing non-probabilistic fairness frameworks and propose an algorithm for obtaining a fair probabilistic cluster solution from a data representation known as a fairlet decomposition. Finally, we demonstrate our proposed fairness metrics and algorithm by constructing a fair Gaussian mixture model on three real-world datasets. We achieve this by identifying balanced micro-clusters which minimize the distances induced by the model, and on which traditional clustering can be performed while ensuring the fairness of the solution.
APA
Kjærsgaard, R.D., Parviainen, P., Saurabh, S., Kundu, M. & Clemmensen, L. (2024). Fair Soft Clustering. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1270-1278. Available from https://proceedings.mlr.press/v238/kjaersgaard24a.html.
