On the price of explainability for some clustering problems

Eduardo S Laber, Lucas Murtinho
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:5915-5925, 2021.

Abstract

The price of explainability for a clustering task can be defined as the unavoidable loss, in terms of the objective function, if we force the final partition to be explainable. Here, we study this price for the following clustering problems: $k$-means, $k$-medians, $k$-centers and maximum-spacing. We provide upper and lower bounds for a natural model where explainability is achieved via decision trees. For the $k$-means and $k$-medians problems our upper bounds improve those obtained by [Dasgupta et al., ICML 20] for low dimensions. Another contribution is a simple and efficient algorithm for building explainable clusterings for the $k$-means problem. We provide empirical evidence that its performance is better than the current state of the art for decision-tree-based explainable clustering.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-laber21a,
  title = {On the price of explainability for some clustering problems},
  author = {Laber, Eduardo S and Murtinho, Lucas},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {5915--5925},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/laber21a/laber21a.pdf},
  url = {https://proceedings.mlr.press/v139/laber21a.html},
  abstract = {The price of explainability for a clustering task can be defined as the unavoidable loss, in terms of the objective function, if we force the final partition to be explainable. Here, we study this price for the following clustering problems: $k$-means, $k$-medians, $k$-centers and maximum-spacing. We provide upper and lower bounds for a natural model where explainability is achieved via decision trees. For the $k$-means and $k$-medians problems our upper bounds improve those obtained by [Dasgupta et al., ICML 20] for low dimensions. Another contribution is a simple and efficient algorithm for building explainable clusterings for the $k$-means problem. We provide empirical evidence that its performance is better than the current state of the art for decision-tree-based explainable clustering.}
}
Endnote
%0 Conference Paper
%T On the price of explainability for some clustering problems
%A Eduardo S Laber
%A Lucas Murtinho
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-laber21a
%I PMLR
%P 5915--5925
%U https://proceedings.mlr.press/v139/laber21a.html
%V 139
%X The price of explainability for a clustering task can be defined as the unavoidable loss, in terms of the objective function, if we force the final partition to be explainable. Here, we study this price for the following clustering problems: $k$-means, $k$-medians, $k$-centers and maximum-spacing. We provide upper and lower bounds for a natural model where explainability is achieved via decision trees. For the $k$-means and $k$-medians problems our upper bounds improve those obtained by [Dasgupta et al., ICML 20] for low dimensions. Another contribution is a simple and efficient algorithm for building explainable clusterings for the $k$-means problem. We provide empirical evidence that its performance is better than the current state of the art for decision-tree-based explainable clustering.
APA
Laber, E.S. & Murtinho, L. (2021). On the price of explainability for some clustering problems. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:5915-5925. Available from https://proceedings.mlr.press/v139/laber21a.html.

Related Material