Estimators of Entropy and Information via Inference in Probabilistic Models

Feras Saad, Marco Cusumano-Towner, Vikash Mansinghka
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:5604-5621, 2022.

Abstract

Estimating information-theoretic quantities such as entropy and mutual information is central to many problems in statistics and machine learning, but challenging in high dimensions. This paper presents estimators of entropy via inference (EEVI), which deliver upper and lower bounds on many information quantities for arbitrary variables in a probabilistic generative model. These estimators use importance sampling with proposal distribution families that include amortized variational inference and sequential Monte Carlo, which can be tailored to the target model and used to squeeze true information values with high accuracy. We present several theoretical properties of EEVI and demonstrate scalability and efficacy on two problems from the medical domain: (i) in an expert system for diagnosing liver disorders, we rank medical tests according to how informative they are about latent diseases, given a pattern of observed symptoms and patient attributes; and (ii) in a differential equation model of carbohydrate metabolism, we find optimal times to take blood glucose measurements that maximize information about a diabetic patient’s insulin sensitivity, given their meal and medication schedule.
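
To make the sandwich construction concrete, the sketch below applies the two Jensen-based bounds to a toy linear-Gaussian model whose marginal entropy is known in closed form: an unbiased importance-sampling estimate of p(x) gives an upper bound on H(X), and an unbiased estimate of 1/p(x) gives a lower bound. This is an illustrative assumption-laden sketch, not code from the paper; the model, proposal, and all names are invented for exposition.

    # Sketch of the importance-sampling entropy "sandwich" described in the
    # abstract. Illustrative toy only, not the paper's implementation.
    #
    # Toy model: Z ~ N(0, 1), X | Z=z ~ N(z, sigma2). Marginally
    # X ~ N(0, 1 + sigma2), so the true entropy H(X) is known in closed form.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    sigma2 = 0.5            # observation noise variance
    N, M = 2000, 50         # outer samples of (X, Z), inner proposal samples

    def log_joint(x, z):
        # log p(x, z) = log p(z) + log p(x | z)
        return norm.logpdf(z, 0.0, 1.0) + norm.logpdf(x, z, np.sqrt(sigma2))

    def proposal_params(x):
        # q(z | x): the exact Gaussian posterior with its variance inflated
        # 2x, so both bounds are strict and the sandwich has visible width.
        post_var = 1.0 / (1.0 + 1.0 / sigma2)
        return post_var * x / sigma2, 2.0 * post_var   # (mean, variance)

    # Joint samples (x, z) ~ p.
    z0 = rng.normal(0.0, 1.0, size=N)
    xs = rng.normal(z0, np.sqrt(sigma2))

    # Upper bound: Z_hat = (1/M) sum_m p(x, z_m) / q(z_m | x) is unbiased
    # for p(x), so by Jensen E[-log Z_hat] >= -log p(x); averaging over
    # x ~ p(x) upper-bounds H(X).
    upper_terms = []
    for x in xs:
        m, v = proposal_params(x)
        zs = rng.normal(m, np.sqrt(v), size=M)
        log_w = log_joint(x, zs) - norm.logpdf(zs, m, np.sqrt(v))
        log_Z_hat = np.logaddexp.reduce(log_w) - np.log(M)  # stable log-mean-exp
        upper_terms.append(-log_Z_hat)

    # Lower bound: with (x, z) ~ p jointly, q(z | x) / p(x, z) is unbiased
    # for 1 / p(x), so E[log q(z | x) - log p(x, z)] <= H(X).
    ms, vs = proposal_params(xs)
    lower_terms = norm.logpdf(z0, ms, np.sqrt(vs)) - log_joint(xs, z0)

    H_true = 0.5 * np.log(2.0 * np.pi * np.e * (1.0 + sigma2))
    print(f"lower bound  {np.mean(lower_terms):+.4f}")
    print(f"true H(X)    {H_true:+.4f}")
    print(f"upper bound  {np.mean(upper_terms):+.4f}")

As the proposal q(z | x) approaches the exact posterior, both bounds tighten around the true value, which is the sense in which proposals tailored to the target model can "squeeze" the information quantity of interest.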

Cite this Paper

BibTeX
@InProceedings{pmlr-v151-saad22a,
  title     = {Estimators of Entropy and Information via Inference in Probabilistic Models},
  author    = {Saad, Feras and Cusumano-Towner, Marco and Mansinghka, Vikash},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {5604--5621},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/saad22a/saad22a.pdf},
  url       = {https://proceedings.mlr.press/v151/saad22a.html}
}
Endnote
%0 Conference Paper
%T Estimators of Entropy and Information via Inference in Probabilistic Models
%A Feras Saad
%A Marco Cusumano-Towner
%A Vikash Mansinghka
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-saad22a
%I PMLR
%P 5604--5621
%U https://proceedings.mlr.press/v151/saad22a.html
%V 151
APA
Saad, F., Cusumano-Towner, M. & Mansinghka, V. (2022). Estimators of Entropy and Information via Inference in Probabilistic Models. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:5604-5621. Available from https://proceedings.mlr.press/v151/saad22a.html.
