Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information

Rohan Ghosh, Mehul Motani
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:793-801, 2025.

Abstract

Mutual information (MI) is widely employed as a measure of shared information between random variables. However, MI assumes unbounded computational resources—a condition rarely met in practice, where predicting a random variable $Y$ from $X$ must rely on finite resources. $\mathcal{V}$-information addresses this limitation by employing a predictive family $\mathcal{V}$ to emulate computational constraints, yielding a directed measure of shared information. Focusing on the mixed setting (continuous $X$ and discrete $Y$), here we highlight the upward bias of empirical $\mathcal{V}$-information, $\hat I_{\mathcal{V}}(X \rightarrow Y)$, even when $\mathcal{V}$ is low-complexity (e.g., shallow neural networks). To mitigate this bias, we introduce $\mathcal{V}$-Information Growth (VI-Growth), defined as $\hat I_{\mathcal{V}}(X \rightarrow Y) - \hat I_{\mathcal{V}}(X' \rightarrow Y')$, where $X', Y' \sim P_X P_Y$ represent independent variables. While VI-Growth effectively counters over-estimation, more complex predictive families may lead to under-estimation. To address this, we construct a sequence of predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}$ of increasing complexity and compute the maximum of VI-Growth across these families, yielding the ordered VI-Growth (O-VIG). We provide theoretical results that justify this approach, showing that O-VIG is a provably tighter lower bound for the true $\mathcal{V}$-Information than empirical $\mathcal{V}$-Information itself, and exhibits stronger convergence properties than $\mathcal{V}$-Information. Empirically, O-VIG alleviates bias and consistently outperforms state-of-the-art methods in both MI estimation and dataset complexity estimation, demonstrating its practical utility.
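The abstract's definitions translate naturally into a short estimator sketch. The code below is illustrative only, not the authors' implementation: it assumes integer-coded labels, realizes the predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots$ as scikit-learn MLP classifiers of increasing width, emulates $(X', Y') \sim P_X P_Y$ by shuffling labels, and assumes each family contains the constant predictors, so the marginal $\mathcal{V}$-entropy reduces to the empirical Shannon entropy of $Y$. All function names are hypothetical.

```python
# A minimal sketch of VI-Growth and O-VIG (assumptions noted above).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import log_loss

def empirical_v_information(X, y, make_model):
    """hat I_V(X -> Y) = hat H(Y) - hat H_V(Y|X), in nats, on the sample.
    Assumes y holds integer class labels 0..K-1."""
    # Marginal V-entropy: empirical Shannon entropy of Y (valid when the
    # family V contains all constant predictors).
    p = np.bincount(y) / len(y)
    h_y = -np.sum(p[p > 0] * np.log(p[p > 0]))
    # Conditional V-entropy: training cross-entropy of the fitted predictor
    # (evaluated on the same data, which is the source of the upward bias).
    model = make_model().fit(X, y)
    h_y_given_x = log_loss(y, model.predict_proba(X), labels=np.unique(y))
    return h_y - h_y_given_x

def vi_growth(X, y, make_model, rng):
    """hat I_V(X -> Y) - hat I_V(X' -> Y'), where (X', Y') ~ P_X P_Y is
    emulated by permuting the labels to break the X-Y dependence."""
    y_perm = rng.permutation(y)
    return (empirical_v_information(X, y, make_model)
            - empirical_v_information(X, y_perm, make_model))

def o_vig(X, y, families, seed=0):
    """Ordered VI-Growth: maximum of VI-Growth over a sequence of
    predictive families of increasing complexity."""
    rng = np.random.default_rng(seed)
    return max(vi_growth(X, y, make, rng) for make in families)

# Hypothetical families of increasing complexity: wider hidden layers.
families = [lambda w=w: MLPClassifier(hidden_layer_sizes=(w,), max_iter=500)
            for w in (4, 16, 64)]
```

The key idea the sketch makes concrete: since the true shared information of an independent pair is zero, the second term in `vi_growth` directly estimates the family's over-fitting bias, which the subtraction cancels; taking the maximum over nested families then guards against the under-estimation that a single overly complex family would cause.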

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-ghosh25a,
  title = {Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information},
  author = {Ghosh, Rohan and Motani, Mehul},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = {793--801},
  year = {2025},
  editor = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = {258},
  series = {Proceedings of Machine Learning Research},
  month = {03--05 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/ghosh25a/ghosh25a.pdf},
  url = {https://proceedings.mlr.press/v258/ghosh25a.html},
  abstract = {Mutual information (MI) is widely employed as a measure of shared information between random variables. However, MI assumes unbounded computational resources—a condition rarely met in practice, where predicting a random variable $Y$ from $X$ must rely on finite resources. $\mathcal{V}$-information addresses this limitation by employing a predictive family $\mathcal{V}$ to emulate computational constraints, yielding a directed measure of shared information. Focusing on the mixed setting (continuous $X$ and discrete $Y$), here we highlight the upward bias of empirical $\mathcal{V}$-information, $\hat I_{\mathcal{V}}(X \rightarrow Y)$, even when $\mathcal{V}$ is low-complexity (e.g., shallow neural networks). To mitigate this bias, we introduce $\mathcal{V}$-Information Growth (VI-Growth), defined as $\hat I_{\mathcal{V}}(X \rightarrow Y) - \hat I_{\mathcal{V}}(X' \rightarrow Y')$, where $X', Y' \sim P_X P_Y$ represent independent variables. While VI-Growth effectively counters over-estimation, more complex predictive families may lead to under-estimation. To address this, we construct a sequence of predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}$ of increasing complexity and compute the maximum of VI-Growth across these families, yielding the ordered VI-Growth (O-VIG). We provide theoretical results that justify this approach, showing that O-VIG is a provably tighter lower bound for the true $\mathcal{V}$-Information than empirical $\mathcal{V}$-Information itself, and exhibits stronger convergence properties than $\mathcal{V}$-Information. Empirically, O-VIG alleviates bias and consistently outperforms state-of-the-art methods in both MI estimation and dataset complexity estimation, demonstrating its practical utility.}
}
Endnote
%0 Conference Paper
%T Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information
%A Rohan Ghosh
%A Mehul Motani
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-ghosh25a
%I PMLR
%P 793--801
%U https://proceedings.mlr.press/v258/ghosh25a.html
%V 258
%X Mutual information (MI) is widely employed as a measure of shared information between random variables. However, MI assumes unbounded computational resources—a condition rarely met in practice, where predicting a random variable $Y$ from $X$ must rely on finite resources. $\mathcal{V}$-information addresses this limitation by employing a predictive family $\mathcal{V}$ to emulate computational constraints, yielding a directed measure of shared information. Focusing on the mixed setting (continuous $X$ and discrete $Y$), here we highlight the upward bias of empirical $\mathcal{V}$-information, $\hat I_{\mathcal{V}}(X \rightarrow Y)$, even when $\mathcal{V}$ is low-complexity (e.g., shallow neural networks). To mitigate this bias, we introduce $\mathcal{V}$-Information Growth (VI-Growth), defined as $\hat I_{\mathcal{V}}(X \rightarrow Y) - \hat I_{\mathcal{V}}(X' \rightarrow Y')$, where $X', Y' \sim P_X P_Y$ represent independent variables. While VI-Growth effectively counters over-estimation, more complex predictive families may lead to under-estimation. To address this, we construct a sequence of predictive families $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}$ of increasing complexity and compute the maximum of VI-Growth across these families, yielding the ordered VI-Growth (O-VIG). We provide theoretical results that justify this approach, showing that O-VIG is a provably tighter lower bound for the true $\mathcal{V}$-Information than empirical $\mathcal{V}$-Information itself, and exhibits stronger convergence properties than $\mathcal{V}$-Information. Empirically, O-VIG alleviates bias and consistently outperforms state-of-the-art methods in both MI estimation and dataset complexity estimation, demonstrating its practical utility.
APA
Ghosh, R. & Motani, M. (2025). Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:793-801. Available from https://proceedings.mlr.press/v258/ghosh25a.html.
