Optimal Submanifold Structure in Log-linear Models

Zhou Derun, Mahito Sugiyama
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:915-932, 2025.

Abstract

In the modeling of discrete distributions using log-linear models, the model selection process is equivalent to imposing zero-value constraints on a subset of natural parameters, which is an established concept in information geometry. This zero-value constraint has been implicitly employed, from classic Boltzmann machines to recent many-body approximations of tensors. However, in theory, any constant value other than zero can be used for these constraints, leading to different submanifolds onto which the empirical distribution is projected, a possibility that has not been explored. Here, we investigate the asymptotic behavior of these constraint values from the perspective of information geometry. Specifically, we prove that the optimal value converges to zero as the size of the support of the empirical distribution increases, which corresponds to the size of the input tensors in the context of tensor decomposition. While our primary focus is on many-body approximation of tensors, it is straightforward to extend this analysis to a wide range of log-linear modeling applications.
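To make the setting concrete, the following is a minimal sketch of the construction the abstract refers to, written in standard information-geometric notation; the symbols (the sample space Ω, parameters θ_s, sufficient statistics φ_s, index set B, and constant c) are our own illustrative choices and are not taken verbatim from the paper.

% A log-linear model over a finite sample space \Omega, with natural
% parameters \theta_s indexed by s \in S:
\[
  \log p_\theta(x) \;=\; \sum_{s \in S} \theta_s \, \phi_s(x) \;-\; \psi(\theta),
  \qquad x \in \Omega,
\]
% where \psi(\theta) is the log-partition function ensuring normalization.
% Standard model selection imposes the zero-value constraint
% \theta_s = 0 for every s in a chosen index set B \subseteq S.
% Replacing 0 with an arbitrary constant c \in \mathbb{R} yields a
% family of submanifolds
\[
  \mathcal{M}_c \;=\; \{\, p_\theta \;:\; \theta_s = c \ \text{for all } s \in B \,\},
\]
% and the model is fit by projecting the empirical distribution
% \hat{p} onto \mathcal{M}_c, i.e., minimizing the KL divergence
% D(\hat{p} \,\|\, q) over q \in \mathcal{M}_c. The paper's result is
% that the optimal c converges to 0 as the support of \hat{p} grows.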

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-derun25a,
  title     = {Optimal Submanifold Structure in Log-linear Models},
  author    = {Derun, Zhou and Sugiyama, Mahito},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages     = {915--932},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/derun25a/derun25a.pdf},
  url       = {https://proceedings.mlr.press/v286/derun25a.html},
  abstract  = {In the modeling of discrete distributions using log-linear models, the model selection process is equivalent to imposing zero-value constraints on a subset of natural parameters, which is an established concept in information geometry. This zero-value constraint has been implicitly employed, from classic Boltzmann machines to recent many-body approximations of tensors. However, in theory, any constant value other than zero can be used for these constraints, leading to different submanifolds onto which the empirical distribution is projected, a possibility that has not been explored. Here, we investigate the asymptotic behavior of these constraint values from the perspective of information geometry. Specifically, we prove that the optimal value converges to zero as the size of the support of the empirical distribution increases, which corresponds to the size of the input tensors in the context of tensor decomposition. While our primary focus is on many-body approximation of tensors, it is straightforward to extend this analysis to a wide range of log-linear modeling applications.}
}
Endnote
%0 Conference Paper
%T Optimal Submanifold Structure in Log-linear Models
%A Zhou Derun
%A Mahito Sugiyama
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-derun25a
%I PMLR
%P 915--932
%U https://proceedings.mlr.press/v286/derun25a.html
%V 286
%X In the modeling of discrete distributions using log-linear models, the model selection process is equivalent to imposing zero-value constraints on a subset of natural parameters, which is an established concept in information geometry. This zero-value constraint has been implicitly employed, from classic Boltzmann machines to recent many-body approximations of tensors. However, in theory, any constant value other than zero can be used for these constraints, leading to different submanifolds onto which the empirical distribution is projected, a possibility that has not been explored. Here, we investigate the asymptotic behavior of these constraint values from the perspective of information geometry. Specifically, we prove that the optimal value converges to zero as the size of the support of the empirical distribution increases, which corresponds to the size of the input tensors in the context of tensor decomposition. While our primary focus is on many-body approximation of tensors, it is straightforward to extend this analysis to a wide range of log-linear modeling applications.
APA
Derun, Z. & Sugiyama, M. (2025). Optimal Submanifold Structure in Log-linear Models. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:915-932. Available from https://proceedings.mlr.press/v286/derun25a.html.
