Stabilizing Sample Similarity in Representation via Mitigating Random Consistency

Jieting Wang, Zelong Zhang, Feijiang Li, Yuhua Qian, Xinyan Liang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:65655-65686, 2025.

Abstract

Deep learning excels at capturing complex data representations, yet quantifying the discriminative quality of these representations remains challenging. While unsupervised metrics often assess pairwise sample similarity, classification tasks fundamentally require class-level discrimination. To bridge this gap, we propose a novel loss function that evaluates representation discriminability via the Euclidean distance between the learned similarity matrix and the true class adjacency matrix. We identify random consistency, an inherent bias in Euclidean distance metrics, as a key obstacle to reliable evaluation, affecting both fairness and discrimination. To address this, we derive the expected Euclidean distance under uniformly distributed label permutations and introduce its closed-form solution, the Pure Square Euclidean Distance (PSED), which provably eliminates random consistency. Theoretically, we demonstrate that PSED satisfies heterogeneity and unbiasedness guarantees, and establish its generalization bound via the exponential Orlicz norm, confirming its statistical learnability. Empirically, our method surpasses conventional loss functions across multiple benchmarks, achieving significant improvements in accuracy, $F_1$ score, and class-structure differentiation. (Code is available at https://github.com/FeijiangLi/ICML2025-PSED.)
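
The correction described in the abstract can be sketched generically: with a learned similarity matrix $S$ and class adjacency matrix $A$, subtract from the squared Euclidean distance $\|S - A\|_F^2$ its expectation over uniformly random label permutations $\pi$, i.e. $\|S - A\|_F^2 - \mathbb{E}_\pi \|S - A_\pi\|_F^2$. Since a permutation only reorders the entries of $A$, the norm terms $\|S\|_F^2$ and $\|A_\pi\|_F^2$ cancel in the difference, leaving only the chance-corrected cross term $-2(\langle S, A\rangle - \mathbb{E}_\pi\langle S, A_\pi\rangle)$. The sketch below is a minimal illustration of this idea under stated assumptions, not the paper's closed-form PSED: cosine similarity for $S$, a Monte Carlo estimate of the permutation expectation in place of the closed form, and hypothetical function names.

import torch

def similarity_matrix(z):
    # Pairwise cosine similarity of an n x d representation matrix z.
    z = torch.nn.functional.normalize(z, dim=1)
    return z @ z.T

def class_adjacency(y):
    # A[i, j] = 1 if samples i and j share a class label, else 0.
    return (y.unsqueeze(0) == y.unsqueeze(1)).float()

def permutation_corrected_distance(z, y, n_perm=32):
    # ||S - A||_F^2 minus a Monte Carlo estimate of its expectation under
    # uniformly random label permutations (a stand-in for the paper's
    # closed-form PSED correction).
    s = similarity_matrix(z)
    dist = ((s - class_adjacency(y)) ** 2).sum()
    expected = torch.stack([
        ((s - class_adjacency(y[torch.randperm(len(y))])) ** 2).sum()
        for _ in range(n_perm)
    ]).mean()
    return dist - expected

# Example usage on a random batch: 16 samples, 8-dim features, 3 classes.
z = torch.randn(16, 8, requires_grad=True)
y = torch.randint(0, 3, (16,))
loss = permutation_corrected_distance(z, y)
loss.backward()

Minimizing this corrected distance pushes $S$ toward the true class structure while discounting the agreement a random labeling would achieve by chance; the paper replaces the Monte Carlo estimate with an exact closed form.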

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wang25ev,
  title     = {Stabilizing Sample Similarity in Representation via Mitigating Random Consistency},
  author    = {Wang, Jieting and Zhang, Zelong and Li, Feijiang and Qian, Yuhua and Liang, Xinyan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {65655--65686},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wang25ev/wang25ev.pdf},
  url       = {https://proceedings.mlr.press/v267/wang25ev.html}
}
Endnote
%0 Conference Paper
%T Stabilizing Sample Similarity in Representation via Mitigating Random Consistency
%A Jieting Wang
%A Zelong Zhang
%A Feijiang Li
%A Yuhua Qian
%A Xinyan Liang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wang25ev
%I PMLR
%P 65655--65686
%U https://proceedings.mlr.press/v267/wang25ev.html
%V 267
APA
Wang, J., Zhang, Z., Li, F., Qian, Y., & Liang, X. (2025). Stabilizing Sample Similarity in Representation via Mitigating Random Consistency. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:65655-65686. Available from https://proceedings.mlr.press/v267/wang25ev.html.
