Super Deep Contrastive Information Bottleneck for Multi-modal Clustering

Zhengzheng Lou, Ke Zhang, Yucong Wu, Shizhe Hu
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:40400-40410, 2025.

Abstract

In an era of increasingly diverse information sources, multi-modal clustering (MMC) has become a key technology for processing multi-modal data: it exploits and integrates the feature information and latent relationships of different modalities. Although there is a wealth of research on MMC, a major challenge remains, owing to the complexity of datasets: how to deeply explore the complex latent information and the interdependencies between modalities. To address this issue, this paper proposes a method called super deep contrastive information bottleneck (SDCIB) for MMC, which aims to explore and exploit all types of latent information to the fullest extent. Specifically, the proposed SDCIB explicitly introduces the rich information contained in the encoder’s hidden layers into the loss function for the first time, thoroughly mining both modal features and the hidden relationships between modalities. Moreover, by performing dual optimization that simultaneously enforces consistency from both the feature-distribution and clustering-assignment perspectives, the proposed SDCIB significantly improves clustering accuracy and robustness. We conducted experiments on four multi-modal datasets; on the ESP dataset, the method improves accuracy by 9.3%. The results demonstrate the superiority and sound design of the proposed SDCIB. The source code is available at https://github.com/ShizheHu.
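
To make the abstract's two ideas concrete — feeding the encoder's hidden-layer representations into the loss, and dual consistency over feature distributions and clustering assignments — the following is a minimal, hypothetical PyTorch sketch. It is not the authors' implementation (which lives at https://github.com/ShizheHu); the encoder shapes, the InfoNCE and symmetric-KL choices, and the weight lam are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    # MLP encoder that exposes every hidden-layer representation, not just the last.
    def __init__(self, in_dim, hidden_dims=(512, 256), n_clusters=10):
        super().__init__()
        self.blocks = nn.ModuleList()
        d = in_dim
        for h in hidden_dims:
            self.blocks.append(nn.Sequential(nn.Linear(d, h), nn.ReLU()))
            d = h
        self.cluster_head = nn.Linear(d, n_clusters)

    def forward(self, x):
        hiddens = []
        for block in self.blocks:
            x = block(x)
            hiddens.append(x)
        assignment = F.softmax(self.cluster_head(x), dim=1)  # soft cluster assignment
        return hiddens, assignment

def info_nce(z1, z2, temperature=0.5):
    # Standard InfoNCE contrastive loss: paired samples across modalities are positives.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

def dual_consistency_loss(hid_a, hid_b, p_a, p_b, lam=1.0):
    # (1) Contrast every hidden layer across the two modalities, so intermediate
    #     representations enter the objective directly (the "hidden layers in the
    #     loss" idea from the abstract). Assumes both encoders share layer widths.
    feature_term = sum(info_nce(ha, hb) for ha, hb in zip(hid_a, hid_b))
    # (2) Enforce consistency of the soft clustering assignments via symmetric KL
    #     (one plausible choice; the paper's exact consistency term may differ).
    log_a = p_a.clamp_min(1e-8).log()
    log_b = p_b.clamp_min(1e-8).log()
    assign_term = (F.kl_div(log_a, p_b, reduction='batchmean')
                   + F.kl_div(log_b, p_a, reduction='batchmean'))
    return feature_term + lam * assign_term

# Toy usage with a 32-sample batch from two modalities:
enc_a, enc_b = ModalityEncoder(in_dim=784), ModalityEncoder(in_dim=1024)
ha, pa = enc_a(torch.randn(32, 784))
hb, pb = enc_b(torch.randn(32, 1024))
loss = dual_consistency_loss(ha, hb, pa, pb)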

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-lou25e,
  title     = {Super Deep Contrastive Information Bottleneck for Multi-modal Clustering},
  author    = {Lou, Zhengzheng and Zhang, Ke and Wu, Yucong and Hu, Shizhe},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {40400--40410},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/lou25e/lou25e.pdf},
  url       = {https://proceedings.mlr.press/v267/lou25e.html}
}
Endnote
%0 Conference Paper
%T Super Deep Contrastive Information Bottleneck for Multi-modal Clustering
%A Zhengzheng Lou
%A Ke Zhang
%A Yucong Wu
%A Shizhe Hu
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-lou25e
%I PMLR
%P 40400--40410
%U https://proceedings.mlr.press/v267/lou25e.html
%V 267
APA
Lou, Z., Zhang, K., Wu, Y. & Hu, S. (2025). Super Deep Contrastive Information Bottleneck for Multi-modal Clustering. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:40400-40410. Available from https://proceedings.mlr.press/v267/lou25e.html.
