A Pairwise Fair and Community-preserving Approach to k-Center Clustering

Brian Brubach, Darshan Chakrabarti, John Dickerson, Samir Khuller, Aravind Srinivasan, Leonidas Tsepenekas
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1178-1189, 2020.

Abstract

Clustering is a foundational problem in machine learning with numerous applications. As machine learning increases in ubiquity as a backend for automated systems, concerns about fairness arise. Much of the current literature on fairness deals with discrimination against protected classes in supervised learning (group fairness). We define a different notion of fair clustering wherein the probability that two points (or a community of points) become separated is bounded by an increasing function of their pairwise distance (or community diameter). We capture the situation where data points represent people who gain some benefit from being clustered together. Unfairness arises when certain points are deterministically separated, either arbitrarily or by someone who intends to harm them as in the case of gerrymandering election districts. In response, we formally define two new types of fairness in the clustering setting, pairwise fairness and community preservation. To explore the practicality of our fairness goals, we devise an approach for extending existing $k$-center algorithms to satisfy these fairness constraints. Analysis of this approach proves that reasonable approximations can be achieved while maintaining fairness. In experiments, we compare the effectiveness of our approach to classical $k$-center algorithms/heuristics and explore the tradeoff between optimal clustering and fairness.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-brubach20a, title = {A Pairwise Fair and Community-preserving Approach to k-Center Clustering}, author = {Brubach, Brian and Chakrabarti, Darshan and Dickerson, John and Khuller, Samir and Srinivasan, Aravind and Tsepenekas, Leonidas}, booktitle = {Proceedings of the 37th International Conference on Machine Learning}, pages = {1178--1189}, year = {2020}, editor = {Hal Daumé III and Aarti Singh}, volume = {119}, series = {Proceedings of Machine Learning Research}, month = {13--18 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v119/brubach20a/brubach20a.pdf}, url = { http://proceedings.mlr.press/v119/brubach20a.html }, abstract = {Clustering is a foundational problem in machine learning with numerous applications. As machine learning increases in ubiquity as a backend for automated systems, concerns about fairness arise. Much of the current literature on fairness deals with discrimination against protected classes in supervised learning (group fairness). We define a different notion of fair clustering wherein the probability that two points (or a community of points) become separated is bounded by an increasing function of their pairwise distance (or community diameter). We capture the situation where data points represent people who gain some benefit from being clustered together. Unfairness arises when certain points are deterministically separated, either arbitrarily or by someone who intends to harm them as in the case of gerrymandering election districts. In response, we formally define two new types of fairness in the clustering setting, pairwise fairness and community preservation. To explore the practicality of our fairness goals, we devise an approach for extending existing $k$-center algorithms to satisfy these fairness constraints. Analysis of this approach proves that reasonable approximations can be achieved while maintaining fairness. In experiments, we compare the effectiveness of our approach to classical $k$-center algorithms/heuristics and explore the tradeoff between optimal clustering and fairness.} }
Endnote
%0 Conference Paper %T A Pairwise Fair and Community-preserving Approach to k-Center Clustering %A Brian Brubach %A Darshan Chakrabarti %A John Dickerson %A Samir Khuller %A Aravind Srinivasan %A Leonidas Tsepenekas %B Proceedings of the 37th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2020 %E Hal Daumé III %E Aarti Singh %F pmlr-v119-brubach20a %I PMLR %P 1178--1189 %U http://proceedings.mlr.press/v119/brubach20a.html %V 119 %X Clustering is a foundational problem in machine learning with numerous applications. As machine learning increases in ubiquity as a backend for automated systems, concerns about fairness arise. Much of the current literature on fairness deals with discrimination against protected classes in supervised learning (group fairness). We define a different notion of fair clustering wherein the probability that two points (or a community of points) become separated is bounded by an increasing function of their pairwise distance (or community diameter). We capture the situation where data points represent people who gain some benefit from being clustered together. Unfairness arises when certain points are deterministically separated, either arbitrarily or by someone who intends to harm them as in the case of gerrymandering election districts. In response, we formally define two new types of fairness in the clustering setting, pairwise fairness and community preservation. To explore the practicality of our fairness goals, we devise an approach for extending existing $k$-center algorithms to satisfy these fairness constraints. Analysis of this approach proves that reasonable approximations can be achieved while maintaining fairness. In experiments, we compare the effectiveness of our approach to classical $k$-center algorithms/heuristics and explore the tradeoff between optimal clustering and fairness.
APA
Brubach, B., Chakrabarti, D., Dickerson, J., Khuller, S., Srinivasan, A. & Tsepenekas, L.. (2020). A Pairwise Fair and Community-preserving Approach to k-Center Clustering. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1178-1189 Available from http://proceedings.mlr.press/v119/brubach20a.html .

Related Material