Learning from Sample Stability for Deep Clustering

Zhixin Li; Yuheng Jia; Hui Liu; Junhui Hou

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:34904-34919, 2025.

Abstract

Deep clustering, an unsupervised technique independent of labels, necessitates tailored supervision for model training. Prior methods explore supervision like similarity and pseudo labels, yet overlook individual sample training analysis. Our study correlates sample stability during unsupervised training with clustering accuracy and network memorization on a per-sample basis. Unstable representations across epochs often lead to mispredictions, indicating difficulty in memorization and atypicality. Leveraging these findings, we introduce supervision signals for the first time based on sample stability at the representation level. Our proposed strategy serves as a versatile tool to enhance various deep clustering techniques. Experiments across benchmark datasets showcase that incorporating sample stability into training can improve the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-li25am,
  title     = {Learning from Sample Stability for Deep Clustering},
  author    = {Li, Zhixin and Jia, Yuheng and Liu, Hui and Hou, Junhui},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {34904--34919},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/li25am/li25am.pdf},
  url       = {https://proceedings.mlr.press/v267/li25am.html},
  abstract = 	 {Deep clustering, an unsupervised technique independent of labels, necessitates tailored supervision for model training. Prior methods explore supervision like similarity and pseudo labels, yet overlook individual sample training analysis. Our study correlates sample stability during unsupervised training with clustering accuracy and network memorization on a per-sample basis. Unstable representations across epochs often lead to mispredictions, indicating difficulty in memorization and atypicality. Leveraging these findings, we introduce supervision signals for the first time based on sample stability at the representation level. Our proposed strategy serves as a versatile tool to enhance various deep clustering techniques. Experiments across benchmark datasets showcase that incorporating sample stability into training can improve the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.}
}

Endnote

%0 Conference Paper
%T Learning from Sample Stability for Deep Clustering
%A Zhixin Li
%A Yuheng Jia
%A Hui Liu
%A Junhui Hou
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-li25am
%I PMLR
%P 34904--34919
%U https://proceedings.mlr.press/v267/li25am.html
%V 267
%X Deep clustering, an unsupervised technique independent of labels, necessitates tailored supervision for model training. Prior methods explore supervision like similarity and pseudo labels, yet overlook individual sample training analysis. Our study correlates sample stability during unsupervised training with clustering accuracy and network memorization on a per-sample basis. Unstable representations across epochs often lead to mispredictions, indicating difficulty in memorization and atypicality. Leveraging these findings, we introduce supervision signals for the first time based on sample stability at the representation level. Our proposed strategy serves as a versatile tool to enhance various deep clustering techniques. Experiments across benchmark datasets showcase that incorporating sample stability into training can improve the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.

APA

Li, Z., Jia, Y., Liu, H. & Hou, J. (2025). Learning from Sample Stability for Deep Clustering. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:34904-34919. Available from https://proceedings.mlr.press/v267/li25am.html.
