Learning from Sample Stability for Deep Clustering

Zhixin Li, Yuheng Jia, Hui Liu, Junhui Hou
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:34904-34919, 2025.

Abstract

Deep clustering, an unsupervised technique that operates without labels, nevertheless requires tailored supervision for model training. Prior methods explore supervision signals such as similarity and pseudo labels, yet overlook how individual samples behave over the course of training. Our study correlates the stability of each sample during unsupervised training with clustering accuracy and network memorization on a per-sample basis. Samples whose representations are unstable across epochs are often mispredicted, indicating that they are hard to memorize and atypical. Leveraging these findings, we introduce, for the first time, supervision signals based on sample stability at the representation level. The proposed strategy serves as a versatile tool to enhance various deep clustering techniques. Experiments on benchmark datasets show that incorporating sample stability into training improves the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.
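The abstract describes supervision derived from how stable each sample's representation stays across training epochs, but it does not spell out a concrete stability measure here. As a minimal, hypothetical sketch only (the authors' actual formulation lives in the linked repository), one plausible measure is the average cosine similarity between a sample's representations at consecutive epochs; the function name sample_stability and the thresholding step below are illustrative assumptions, not the paper's method.

import torch
import torch.nn.functional as F

def sample_stability(rep_history: torch.Tensor) -> torch.Tensor:
    # Illustrative stability score, NOT the paper's exact definition.
    # rep_history: (num_epochs, num_samples, dim) representations recorded
    # at the end of each epoch for every sample.
    reps = F.normalize(rep_history, dim=-1)   # unit norm so dot product = cosine
    sim = (reps[1:] * reps[:-1]).sum(dim=-1)  # (num_epochs - 1, num_samples)
    return sim.mean(dim=0)                    # higher = more stable across epochs

# Toy usage: 5 recorded epochs, 4 samples, 8-dim representations.
history = torch.randn(5, 4, 8)
scores = sample_stability(history)
stable = scores > scores.median()  # e.g., weight stable samples more in training

Under this reading, a sample whose representation drifts between epochs receives a low score, matching the abstract's observation that unstable representations tend to be mispredicted, hard to memorize, and atypical.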

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-li25am,
  title     = {Learning from Sample Stability for Deep Clustering},
  author    = {Li, Zhixin and Jia, Yuheng and Liu, Hui and Hou, Junhui},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {34904--34919},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/li25am/li25am.pdf},
  url       = {https://proceedings.mlr.press/v267/li25am.html},
  abstract  = {Deep clustering, an unsupervised technique independent of labels, necessitates tailored supervision for model training. Prior methods explore supervision like similarity and pseudo labels, yet overlook individual sample training analysis. Our study correlates sample stability during unsupervised training with clustering accuracy and network memorization on a per-sample basis. Unstable representations across epochs often lead to mispredictions, indicating difficulty in memorization and atypicality. Leveraging these findings, we introduce supervision signals for the first time based on sample stability at the representation level. Our proposed strategy serves as a versatile tool to enhance various deep clustering techniques. Experiments across benchmark datasets showcase that incorporating sample stability into training can improve the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.}
}
Endnote
%0 Conference Paper
%T Learning from Sample Stability for Deep Clustering
%A Zhixin Li
%A Yuheng Jia
%A Hui Liu
%A Junhui Hou
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-li25am
%I PMLR
%P 34904--34919
%U https://proceedings.mlr.press/v267/li25am.html
%V 267
%X Deep clustering, an unsupervised technique independent of labels, necessitates tailored supervision for model training. Prior methods explore supervision like similarity and pseudo labels, yet overlook individual sample training analysis. Our study correlates sample stability during unsupervised training with clustering accuracy and network memorization on a per-sample basis. Unstable representations across epochs often lead to mispredictions, indicating difficulty in memorization and atypicality. Leveraging these findings, we introduce supervision signals for the first time based on sample stability at the representation level. Our proposed strategy serves as a versatile tool to enhance various deep clustering techniques. Experiments across benchmark datasets showcase that incorporating sample stability into training can improve the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.
APA
Li, Z., Jia, Y., Liu, H. & Hou, J. (2025). Learning from Sample Stability for Deep Clustering. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:34904-34919. Available from https://proceedings.mlr.press/v267/li25am.html.