Clustering Oligarchies

Margareta Ackerman, Shai Ben-David, David Loker, Sivan Sabato
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR 31:66-74, 2013.

Abstract

We investigate the extent to which clustering algorithms are robust to the addition of a small, potentially adversarial, set of points. Our analysis reveals radical differences in the robustness of popular clustering methods. k-means and several related techniques are robust when data is clusterable, and we provide a quantitative analysis capturing the precise relationship between clusterability and robustness. In contrast, common linkage-based algorithms and several standard objective-function-based clustering methods can be highly sensitive to the addition of a small set of points even when the data is highly clusterable. We call such sets of points oligarchies. Lastly, we show that the behavior with respect to oligarchies of the popular Lloyd’s method changes radically with the initialization technique.

Cite this Paper


BibTeX
@InProceedings{pmlr-v31-ackerman13a, title = {Clustering Oligarchies}, author = {Ackerman, Margareta and Ben-David, Shai and Loker, David and Sabato, Sivan}, booktitle = {Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics}, pages = {66--74}, year = {2013}, editor = {Carvalho, Carlos M. and Ravikumar, Pradeep}, volume = {31}, series = {Proceedings of Machine Learning Research}, address = {Scottsdale, Arizona, USA}, month = {29 Apr--01 May}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v31/ackerman13a.pdf}, url = {https://proceedings.mlr.press/v31/ackerman13a.html}, abstract = {We investigate the extent to which clustering algorithms are robust to the addition of a small, potentially adversarial, set of points. Our analysis reveals radical differences in the robustness of popular clustering methods. k-means and several related techniques are robust when data is clusterable, and we provide a quantitative analysis capturing the precise relationship between clusterability and robustness. In contrast, common linkage-based algorithms and several standard objective-function-based clustering methods can be highly sensitive to the addition of a small set of points even when the data is highly clusterable. We call such sets of points oligarchies. Lastly, we show that the behavior with respect to oligarchies of the popular Lloyd’s method changes radically with the initialization technique.} }
Endnote
%0 Conference Paper %T Clustering Oligarchies %A Margareta Ackerman %A Shai Ben-David %A David Loker %A Sivan Sabato %B Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2013 %E Carlos M. Carvalho %E Pradeep Ravikumar %F pmlr-v31-ackerman13a %I PMLR %P 66--74 %U https://proceedings.mlr.press/v31/ackerman13a.html %V 31 %X We investigate the extent to which clustering algorithms are robust to the addition of a small, potentially adversarial, set of points. Our analysis reveals radical differences in the robustness of popular clustering methods. k-means and several related techniques are robust when data is clusterable, and we provide a quantitative analysis capturing the precise relationship between clusterability and robustness. In contrast, common linkage-based algorithms and several standard objective-function-based clustering methods can be highly sensitive to the addition of a small set of points even when the data is highly clusterable. We call such sets of points oligarchies. Lastly, we show that the behavior with respect to oligarchies of the popular Lloyd’s method changes radically with the initialization technique.
RIS
TY - CPAPER TI - Clustering Oligarchies AU - Margareta Ackerman AU - Shai Ben-David AU - David Loker AU - Sivan Sabato BT - Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics DA - 2013/04/29 ED - Carlos M. Carvalho ED - Pradeep Ravikumar ID - pmlr-v31-ackerman13a PB - PMLR DP - Proceedings of Machine Learning Research VL - 31 SP - 66 EP - 74 L1 - http://proceedings.mlr.press/v31/ackerman13a.pdf UR - https://proceedings.mlr.press/v31/ackerman13a.html AB - We investigate the extent to which clustering algorithms are robust to the addition of a small, potentially adversarial, set of points. Our analysis reveals radical differences in the robustness of popular clustering methods. k-means and several related techniques are robust when data is clusterable, and we provide a quantitative analysis capturing the precise relationship between clusterability and robustness. In contrast, common linkage-based algorithms and several standard objective-function-based clustering methods can be highly sensitive to the addition of a small set of points even when the data is highly clusterable. We call such sets of points oligarchies. Lastly, we show that the behavior with respect to oligarchies of the popular Lloyd’s method changes radically with the initialization technique. ER -
APA
Ackerman, M., Ben-David, S., Loker, D. & Sabato, S.. (2013). Clustering Oligarchies. Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 31:66-74 Available from https://proceedings.mlr.press/v31/ackerman13a.html.

Related Material