Unconditionally Calibrated Priors for Beta Mixture Density Networks

Alix Lhéritier, Maurizio Filippone
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:4933-4941, 2025.

Abstract

Mixture Density Networks (MDNs) make it possible to model arbitrarily complex mappings between inputs and mixture densities, enabling flexible conditional density estimation at the risk of severe overfitting. A Bayesian approach can alleviate this problem by specifying a prior over the parameters of the neural network. However, these priors can be difficult to specify due to their lack of interpretability. We propose a novel neural network construction for conditional mixture densities that allows one to specify the prior in the predictive distribution domain. The construction is based on mapping the targets to the unit hypercube via a diffeomorphism, enabling the use of mixtures of Beta distributions. We prove that the prior predictive distributions are calibrated in the sense that they are equal to the unconditional density function defined by the diffeomorphism. We show that, in contrast to Bayesian Gaussian MDNs, which tie functional and distributional complexity together, our construction allows the two to be decoupled. We propose an extension that models correlations in the covariates via Gaussian copulas, potentially reducing the number of mixture components needed. Our experiments show performance competitive with the state of the art on standard benchmarks.
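
To make the construction concrete, the following is a minimal sketch, not the authors' implementation: it assumes the standard-normal CDF as the diffeomorphism mapping a target y in R to z in (0,1), where a mixture of Beta densities is evaluated, with the change-of-variables factor giving back a density on R. In a full MDN, the mixture weights and Beta parameters would be produced by a neural network conditioned on the input; here they are fixed arguments for illustration.

import numpy as np
from scipy.stats import norm, beta

def beta_mixture_pdf_on_y(y, weights, alphas, betas):
    """Density of y in R induced by a Beta mixture on z = Phi(y) in (0,1),
    using the change-of-variables factor |dz/dy| = phi(y)."""
    z = norm.cdf(y)    # diffeomorphism R -> (0,1) (assumed choice)
    jac = norm.pdf(y)  # Jacobian of the mapping
    mix = sum(w * beta.pdf(z, a, b)
              for w, a, b in zip(weights, alphas, betas))
    return mix * jac

# Toy check: with a single uniform Beta(1, 1) component, the induced
# density on y is exactly the unconditional density defined by the
# diffeomorphism (here, a standard normal), illustrating the sense of
# calibration described in the abstract.
y = np.linspace(-3, 3, 5)
print(beta_mixture_pdf_on_y(y, [1.0], [1.0], [1.0]))  # matches norm.pdf(y)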

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-lheritier25a,
  title     = {Unconditionally Calibrated Priors for Beta Mixture Density Networks},
  author    = {Lh{\'e}ritier, Alix and Filippone, Maurizio},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {4933--4941},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/lheritier25a/lheritier25a.pdf},
  url       = {https://proceedings.mlr.press/v258/lheritier25a.html}
}
Endnote
%0 Conference Paper
%T Unconditionally Calibrated Priors for Beta Mixture Density Networks
%A Alix Lhéritier
%A Maurizio Filippone
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-lheritier25a
%I PMLR
%P 4933--4941
%U https://proceedings.mlr.press/v258/lheritier25a.html
%V 258
APA
Lhéritier, A. & Filippone, M. (2025). Unconditionally Calibrated Priors for Beta Mixture Density Networks. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:4933-4941. Available from https://proceedings.mlr.press/v258/lheritier25a.html.
