Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information

Rohan Ghosh, Mehul Motani
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:793-801, 2025.

Abstract

Mutual information (MI) is widely employed as a measure of shared information between random variables. However, MI assumes unbounded computational resources—a condition rarely met in practice, where predicting a random variable $Y$ from $X$ must rely on finite resources. $\mathcal{V}$-information addresses this limitation by employing a predictive family $\mathcal{V}$ to emulate computational constraints, yielding a directed measure of shared information. Focusing on the mixed setting (continuous $X$ and discrete $Y$), here we highlight the upward bias of empirical $\mathcal{V}$-information, $\hat I_{\mathcal{V}}(X \rightarrow Y)$, even when $\mathcal{V}$ is low-complexity (e.g., shallow neural networks). To mitigate this bias, we introduce $\mathcal{V}$-Information Growth (VI-Growth), defined as $\hat I_{\mathcal{V}}(X \rightarrow Y) - \hat I_{\mathcal{V}}(X' \rightarrow Y')$, where $X', Y' \sim P_X P_Y$ represent independent variables. While VI-Growth effectively counters over-estimation, more complex predictive families may lead to under-estimation. To address this, we construct a sequence of predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}$ of increasing complexity and compute the maximum of VI-Growth across these families, yielding the ordered VI-Growth (O-VIG). We provide theoretical results that justify this approach, showing that O-VIG is a provably tighter lower bound for the true $\mathcal{V}$-Information than empirical $\mathcal{V}$-Information itself, and exhibits stronger convergence properties than $\mathcal{V}$-Information. Empirically, O-VIG alleviates bias and consistently outperforms state-of-the-art methods in both MI estimation and dataset complexity estimation, demonstrating its practical utility.
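The abstract's definitions translate naturally into a short estimator sketch. The code below is illustrative only, not the authors' implementation: it assumes integer-coded labels, realizes the predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots$ as scikit-learn MLP classifiers of increasing width, emulates $(X', Y') \sim P_X P_Y$ by shuffling labels, and assumes each family contains the constant predictors, so the marginal $\mathcal{V}$-entropy reduces to the empirical Shannon entropy of $Y$. All function names are hypothetical.

```python
# A minimal sketch of VI-Growth and O-VIG (assumptions noted above).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import log_loss

def empirical_v_information(X, y, make_model):
    """hat I_V(X -> Y) = hat H(Y) - hat H_V(Y|X), in nats, on the sample.
    Assumes y holds integer class labels 0..K-1."""
    # Marginal V-entropy: empirical Shannon entropy of Y (valid when the
    # family V contains all constant predictors).
    p = np.bincount(y) / len(y)
    h_y = -np.sum(p[p > 0] * np.log(p[p > 0]))
    # Conditional V-entropy: training cross-entropy of the fitted predictor
    # (evaluated on the same data, which is the source of the upward bias).
    model = make_model().fit(X, y)
    h_y_given_x = log_loss(y, model.predict_proba(X), labels=np.unique(y))
    return h_y - h_y_given_x

def vi_growth(X, y, make_model, rng):
    """hat I_V(X -> Y) - hat I_V(X' -> Y'), where (X', Y') ~ P_X P_Y is
    emulated by permuting the labels to break the X-Y dependence."""
    y_perm = rng.permutation(y)
    return (empirical_v_information(X, y, make_model)
            - empirical_v_information(X, y_perm, make_model))

def o_vig(X, y, families, seed=0):
    """Ordered VI-Growth: maximum of VI-Growth over a sequence of
    predictive families of increasing complexity."""
    rng = np.random.default_rng(seed)
    return max(vi_growth(X, y, make, rng) for make in families)

# Hypothetical families of increasing complexity: wider hidden layers.
families = [lambda w=w: MLPClassifier(hidden_layer_sizes=(w,), max_iter=500)
            for w in (4, 16, 64)]
```

The key idea the sketch makes concrete: since the true shared information of an independent pair is zero, the second term in `vi_growth` directly estimates the family's over-fitting bias, which the subtraction cancels; taking the maximum over nested families then guards against the under-estimation that a single overly complex family would cause.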

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-ghosh25a,
  title = {Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information},
  author = {Ghosh, Rohan and Motani, Mehul},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = {793--801},
  year = {2025},
  editor = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = {258},
  series = {Proceedings of Machine Learning Research},
  month = {03--05 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/ghosh25a/ghosh25a.pdf},
  url = {https://proceedings.mlr.press/v258/ghosh25a.html},
  abstract = {Mutual information (MI) is widely employed as a measure of shared information between random variables. However, MI assumes unbounded computational resources—a condition rarely met in practice, where predicting a random variable $Y$ from $X$ must rely on finite resources. $\mathcal{V}$-information addresses this limitation by employing a predictive family $\mathcal{V}$ to emulate computational constraints, yielding a directed measure of shared information. Focusing on the mixed setting (continuous $X$ and discrete $Y$), here we highlight the upward bias of empirical $\mathcal{V}$-information, $\hat I_{\mathcal{V}}(X \rightarrow Y)$, even when $\mathcal{V}$ is low-complexity (e.g., shallow neural networks). To mitigate this bias, we introduce $\mathcal{V}$-Information Growth (VI-Growth), defined as $\hat I_{\mathcal{V}}(X \rightarrow Y) - \hat I_{\mathcal{V}}(X' \rightarrow Y')$, where $X', Y' \sim P_X P_Y$ represent independent variables. While VI-Growth effectively counters over-estimation, more complex predictive families may lead to under-estimation. To address this, we construct a sequence of predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}$ of increasing complexity and compute the maximum of VI-Growth across these families, yielding the ordered VI-Growth (O-VIG). We provide theoretical results that justify this approach, showing that O-VIG is a provably tighter lower bound for the true $\mathcal{V}$-Information than empirical $\mathcal{V}$-Information itself, and exhibits stronger convergence properties than $\mathcal{V}$-Information. Empirically, O-VIG alleviates bias and consistently outperforms state-of-the-art methods in both MI estimation and dataset complexity estimation, demonstrating its practical utility.}
}
Endnote
%0 Conference Paper
%T Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information
%A Rohan Ghosh
%A Mehul Motani
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-ghosh25a
%I PMLR
%P 793--801
%U https://proceedings.mlr.press/v258/ghosh25a.html
%V 258
%X Mutual information (MI) is widely employed as a measure of shared information between random variables. However, MI assumes unbounded computational resources—a condition rarely met in practice, where predicting a random variable $Y$ from $X$ must rely on finite resources. $\mathcal{V}$-information addresses this limitation by employing a predictive family $\mathcal{V}$ to emulate computational constraints, yielding a directed measure of shared information. Focusing on the mixed setting (continuous $X$ and discrete $Y$), here we highlight the upward bias of empirical $\mathcal{V}$-information, $\hat I_{\mathcal{V}}(X \rightarrow Y)$, even when $\mathcal{V}$ is low-complexity (e.g., shallow neural networks). To mitigate this bias, we introduce $\mathcal{V}$-Information Growth (VI-Growth), defined as $\hat I_{\mathcal{V}}(X \rightarrow Y) - \hat I_{\mathcal{V}}(X' \rightarrow Y')$, where $X', Y' \sim P_X P_Y$ represent independent variables. While VI-Growth effectively counters over-estimation, more complex predictive families may lead to under-estimation. To address this, we construct a sequence of predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}$ of increasing complexity and compute the maximum of VI-Growth across these families, yielding the ordered VI-Growth (O-VIG). We provide theoretical results that justify this approach, showing that O-VIG is a provably tighter lower bound for the true $\mathcal{V}$-Information than empirical $\mathcal{V}$-Information itself, and exhibits stronger convergence properties than $\mathcal{V}$-Information. Empirically, O-VIG alleviates bias and consistently outperforms state-of-the-art methods in both MI estimation and dataset complexity estimation, demonstrating its practical utility.
APA
Ghosh, R. & Motani, M. (2025). Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:793-801. Available from https://proceedings.mlr.press/v258/ghosh25a.html.
