Exactly Tight Information-theoretic Generalization Bounds via Binary Jensen-Shannon Divergence

Yuxin Dong, Haoran Guo, Tieliang Gong, Wen Wen, Chen Li
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:14040-14060, 2025.

Abstract

Information-theoretic bounds, while achieving significant success in analyzing the generalization of randomized learning algorithms, have been criticized for their slow convergence rates and overestimation. This paper presents novel bounds that bridge the expected empirical and population risks through a binarized variant of the Jensen-Shannon divergence. Leveraging our foundational lemma that characterizes the interaction between an arbitrary and a binary variable, we derive hypothesis-based bounds that enhance existing conditional mutual information bounds by reducing the number of conditioned samples from $2$ to $1$. We additionally establish prediction-based bounds that surpass prior bounds based on evaluated loss mutual information measures. Thereafter, through a new binarization technique for the evaluated loss variables, we obtain exactly tight generalization bounds broadly applicable to general randomized learning algorithms for any bounded loss functions. Our results effectively address key limitations of previous results in analyzing certain stochastic convex optimization problems, without requiring additional stability or compressibility assumptions about the learning algorithm.
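As a point of reference (the paper's precise binarized variant, weighting, and constants may differ), the binary Kullback–Leibler and Jensen–Shannon divergences between Bernoulli parameters $p, q \in [0, 1]$ are standardly defined as
$$
d_{\mathrm{KL}}(p \,\|\, q) = p \log\frac{p}{q} + (1 - p) \log\frac{1 - p}{1 - q}, \qquad
d_{\mathrm{JS}}(p \,\|\, q) = \frac{1}{2}\, d_{\mathrm{KL}}\!\left(p \,\Big\|\, \frac{p + q}{2}\right) + \frac{1}{2}\, d_{\mathrm{KL}}\!\left(q \,\Big\|\, \frac{p + q}{2}\right),
$$
so a bound of the advertised form presumably relates suitably rescaled expected empirical and population risks through $d_{\mathrm{JS}}$, with the divergence controlled by an information measure between the hypothesis (or its predictions) and the training data.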

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-dong25e,
  title     = {Exactly Tight Information-theoretic Generalization Bounds via Binary Jensen-Shannon Divergence},
  author    = {Dong, Yuxin and Guo, Haoran and Gong, Tieliang and Wen, Wen and Li, Chen},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {14040--14060},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/dong25e/dong25e.pdf},
  url       = {https://proceedings.mlr.press/v267/dong25e.html},
  abstract  = {Information-theoretic bounds, while achieving significant success in analyzing the generalization of randomized learning algorithms, have been criticized for their slow convergence rates and overestimation. This paper presents novel bounds that bridge the expected empirical and population risks through a binarized variant of the Jensen-Shannon divergence. Leveraging our foundational lemma that characterizes the interaction between an arbitrary and a binary variable, we derive hypothesis-based bounds that enhance existing conditional mutual information bounds by reducing the number of conditioned samples from $2$ to $1$. We additionally establish prediction-based bounds that surpass prior bounds based on evaluated loss mutual information measures. Thereafter, through a new binarization technique for the evaluated loss variables, we obtain exactly tight generalization bounds broadly applicable to general randomized learning algorithms for any bounded loss functions. Our results effectively address key limitations of previous results in analyzing certain stochastic convex optimization problems, without requiring additional stability or compressibility assumptions about the learning algorithm.}
}
Endnote
%0 Conference Paper
%T Exactly Tight Information-theoretic Generalization Bounds via Binary Jensen-Shannon Divergence
%A Yuxin Dong
%A Haoran Guo
%A Tieliang Gong
%A Wen Wen
%A Chen Li
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-dong25e
%I PMLR
%P 14040--14060
%U https://proceedings.mlr.press/v267/dong25e.html
%V 267
%X Information-theoretic bounds, while achieving significant success in analyzing the generalization of randomized learning algorithms, have been criticized for their slow convergence rates and overestimation. This paper presents novel bounds that bridge the expected empirical and population risks through a binarized variant of the Jensen-Shannon divergence. Leveraging our foundational lemma that characterizes the interaction between an arbitrary and a binary variable, we derive hypothesis-based bounds that enhance existing conditional mutual information bounds by reducing the number of conditioned samples from $2$ to $1$. We additionally establish prediction-based bounds that surpass prior bounds based on evaluated loss mutual information measures. Thereafter, through a new binarization technique for the evaluated loss variables, we obtain exactly tight generalization bounds broadly applicable to general randomized learning algorithms for any bounded loss functions. Our results effectively address key limitations of previous results in analyzing certain stochastic convex optimization problems, without requiring additional stability or compressibility assumptions about the learning algorithm.
APA
Dong, Y., Guo, H., Gong, T., Wen, W. & Li, C. (2025). Exactly Tight Information-theoretic Generalization Bounds via Binary Jensen-Shannon Divergence. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:14040-14060. Available from https://proceedings.mlr.press/v267/dong25e.html.