Consistency of Multiple Kernel Clustering

Weixuan Liang, Xinwang Liu, Yong Liu, Chuan Ma, Yunping Zhao, Zhe Liu, En Zhu
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:20650-20676, 2023.

Abstract

Consistency plays an important role in learning theory. However, in multiple kernel clustering (MKC), the consistency of kernel weights has not been sufficiently investigated. In this work, we fill this gap with a non-asymptotic analysis of the consistency of the kernel weights of a novel method termed SimpleMKKM. Under an eigenvalue-gap assumption, we give an infinity-norm bound of $\widetilde{\mathcal{O}}(k/\sqrt{n})$, where $k$ is the number of clusters and $n$ is the number of samples. On this basis, we establish an upper bound on the excess clustering risk. Moreover, we study the difference between the kernel weights learned from $n$ samples and those learned from $r$ points sampled without replacement, and derive an upper bound of $\widetilde{\mathcal{O}}(k\cdot\sqrt{1/r-1/n})$. Based on these results, we propose a novel strategy based on the Nyström method that enables SimpleMKKM to handle large-scale datasets with a theoretical learning guarantee. Finally, extensive experiments verify the theoretical results and the effectiveness of the proposed large-scale strategy.
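The abstract's large-scale strategy builds on the Nyström method, which approximates a full $n \times n$ kernel matrix from $r$ landmark points sampled without replacement. The paper's specific SimpleMKKM integration is not reproduced here; the following is only a minimal, generic sketch of the underlying Nyström approximation step, with an assumed RBF kernel and hypothetical function names:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """RBF kernel matrix between rows of X and rows of Y (assumed kernel choice)."""
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def nystrom_approximation(X, r, gamma=0.5, seed=0):
    """Rank-r Nystrom approximation of the n x n kernel matrix,
    built from r landmarks sampled without replacement (as in the abstract)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=r, replace=False)   # r points, no replacement
    C = rbf_kernel(X, X[idx], gamma)             # n x r cross-kernel block
    W = C[idx, :]                                # r x r landmark kernel block
    return C @ np.linalg.pinv(W) @ C.T           # K_hat = C W^+ C^T

# Toy usage: compare the approximation against the exact kernel matrix.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
K = rbf_kernel(X, X)
K_approx = nystrom_approximation(X, r=50)
err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)  # relative Frobenius error
```

Since the cost of forming and eigendecomposing `K_approx` scales with $r$ rather than $n$, subsampling in this way is what makes the $\widetilde{\mathcal{O}}(k\cdot\sqrt{1/r-1/n})$ kernel-weight perturbation bound practically relevant: $r$ controls both runtime and approximation quality.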

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-liang23b,
  title     = {Consistency of Multiple Kernel Clustering},
  author    = {Liang, Weixuan and Liu, Xinwang and Liu, Yong and Ma, Chuan and Zhao, Yunping and Liu, Zhe and Zhu, En},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {20650--20676},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/liang23b/liang23b.pdf},
  url       = {https://proceedings.mlr.press/v202/liang23b.html},
  abstract  = {Consistency plays an important role in learning theory. However, in multiple kernel clustering (MKC), the consistency of kernel weights has not been sufficiently investigated. In this work, we fill this gap with a non-asymptotic analysis on the consistency of kernel weights of a novel method termed SimpleMKKM. Under the assumptions of the eigenvalue gap, we give an infinity norm bound as $\widetilde{\mathcal{O}}(k/\sqrt{n})$, where $k$ is the number of clusters and $n$ is the number of samples. On this basis, we establish an upper bound for the excess clustering risk. Moreover, we study the difference of the kernel weights learned from $n$ samples and $r$ points sampled without replacement, and derive its upper bound as $\widetilde{\mathcal{O}}(k\cdot\sqrt{1/r-1/n})$. Based on the above results, we propose a novel strategy with Nyström method to enable SimpleMKKM to handle large-scale datasets with a theoretical learning guarantee. Finally, extensive experiments are conducted to verify the theoretical results and the effectiveness of the proposed large-scale strategy.}
}
Endnote
%0 Conference Paper
%T Consistency of Multiple Kernel Clustering
%A Weixuan Liang
%A Xinwang Liu
%A Yong Liu
%A Chuan Ma
%A Yunping Zhao
%A Zhe Liu
%A En Zhu
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-liang23b
%I PMLR
%P 20650--20676
%U https://proceedings.mlr.press/v202/liang23b.html
%V 202
%X Consistency plays an important role in learning theory. However, in multiple kernel clustering (MKC), the consistency of kernel weights has not been sufficiently investigated. In this work, we fill this gap with a non-asymptotic analysis on the consistency of kernel weights of a novel method termed SimpleMKKM. Under the assumptions of the eigenvalue gap, we give an infinity norm bound as $\widetilde{\mathcal{O}}(k/\sqrt{n})$, where $k$ is the number of clusters and $n$ is the number of samples. On this basis, we establish an upper bound for the excess clustering risk. Moreover, we study the difference of the kernel weights learned from $n$ samples and $r$ points sampled without replacement, and derive its upper bound as $\widetilde{\mathcal{O}}(k\cdot\sqrt{1/r-1/n})$. Based on the above results, we propose a novel strategy with Nyström method to enable SimpleMKKM to handle large-scale datasets with a theoretical learning guarantee. Finally, extensive experiments are conducted to verify the theoretical results and the effectiveness of the proposed large-scale strategy.
APA
Liang, W., Liu, X., Liu, Y., Ma, C., Zhao, Y., Liu, Z., & Zhu, E. (2023). Consistency of Multiple Kernel Clustering. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:20650-20676. Available from https://proceedings.mlr.press/v202/liang23b.html.