Strict Monotonicity of Sum of Squares Error and Normalized Cut in the Lattice of Clusterings

Nicola Rebagliati
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(2):163-171, 2013.

Abstract

Sum of Squares Error and Normalized Cut are two widely used clustering functionals. It is known that their minimum values are monotone with respect to the input number of clusters, and this monotonicity does not allow for a simple automatic selection of a correct number of clusters. Here we study monotonicity not just on the minimizers but on the entire clustering lattice. We show that the value of Sum of Squares Error is strictly monotone under the strict refinement relation of clusterings, and we obtain data-dependent bounds on the difference between the value of a clustering and that of one of its refinements. Using analogous techniques, we show that the value of Normalized Cut is strictly anti-monotone. These results imply that even if we restrict our solutions to form a chain of clusterings, like the one produced by hierarchical algorithms, we cannot rely on the functional values in order to choose the number of clusters. Using these results, we obtain data-dependent bounds on the difference between the values of any two clusterings.
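As a quick numerical illustration of the two monotonicity claims (not the paper's proofs or its data-dependent bounds), the sketch below evaluates the standard Sum of Squares Error and Normalized Cut objectives on a toy dataset and compares a 2-clustering against a strict refinement of it. The point set, Gaussian affinities, cluster indices, and helper names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sse(X, clusters):
    """Sum of Squares Error: squared distances of points to their cluster mean."""
    return sum(((X[idx] - X[idx].mean(axis=0)) ** 2).sum() for idx in clusters)

def ncut(W, clusters):
    """Normalized Cut: sum over clusters of cut(A, rest) / vol(A)."""
    n = W.shape[0]
    total = 0.0
    for idx in clusters:
        rest = np.setdiff1d(np.arange(n), idx)
        total += W[np.ix_(idx, rest)].sum() / W[idx].sum()
    return total

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))                            # toy points in the plane
W = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1))     # Gaussian affinity matrix
np.fill_diagonal(W, 0)

coarse = [np.arange(0, 4), np.arange(4, 8)]                  # a 2-clustering
fine = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 8)]   # strict refinement of it

print(sse(X, fine) < sse(X, coarse))    # True: SSE strictly decreases under refinement
print(ncut(W, fine) > ncut(W, coarse))  # True: NCut strictly increases under refinement
```

The strict inequalities hold here because the split parts have distinct means and all affinities are positive; the paper's contribution is to bound how large such gaps are, in a data-dependent way, over the whole clustering lattice rather than on this single example.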

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-rebagliati13,
  title     = {Strict Monotonicity of Sum of Squares Error and Normalized Cut in the Lattice of Clusterings},
  author    = {Rebagliati, Nicola},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages     = {163--171},
  year      = {2013},
  editor    = {Dasgupta, Sanjoy and McAllester, David},
  volume    = {28},
  number    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {Atlanta, Georgia, USA},
  month     = {17--19 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v28/rebagliati13.pdf},
  url       = {https://proceedings.mlr.press/v28/rebagliati13.html}
}
APA
Rebagliati, N. (2013). Strict Monotonicity of Sum of Squares Error and Normalized Cut in the Lattice of Clusterings. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(2):163-171. Available from https://proceedings.mlr.press/v28/rebagliati13.html.