Efficient Estimation of Mutual Information for Strongly Dependent Variables

Shuyang Gao, Greg Ver Steeg, Aram Galstyan
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:277-286, 2015.

Abstract

We demonstrate that a popular class of non-parametric mutual information (MI) estimators based on k-nearest-neighbor graphs requires a number of samples that scales exponentially with the true MI. Consequently, accurate estimation of MI between two strongly dependent variables is possible only for prohibitively large sample sizes. This important yet overlooked shortcoming of the existing estimators is due to their implicit reliance on local uniformity of the underlying joint distribution. We introduce a new estimator that is robust to local non-uniformity, works well with limited data, and is able to capture relationship strengths over many orders of magnitude. We demonstrate the superior performance of the proposed estimator on both synthetic and real-world data.
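The class of estimators the abstract refers to includes the widely used Kraskov–Stögbauer–Grassberger (KSG) estimator. Below is a minimal sketch of that standard kNN estimator — the baseline whose sample complexity the paper analyzes, not the authors' proposed non-uniformity-robust estimator — for paired one-dimensional samples, using the max-norm in the joint space:

```python
# Minimal sketch of the standard KSG kNN mutual-information estimator
# (the baseline critiqued in the paper, not the proposed robust estimator).
import numpy as np
from scipy.special import digamma

def ksg_mi(x, y, k=3):
    """Estimate I(X;Y) in nats from paired 1-D samples (KSG, algorithm 1)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    n = len(x)
    # Pairwise distances; the joint space uses the Chebyshev (max) norm.
    dx = np.abs(x - x.T)
    dy = np.abs(y - y.T)
    dz = np.maximum(dx, dy)
    np.fill_diagonal(dz, np.inf)  # a point is not its own neighbor
    # eps_i: distance from point i to its k-th nearest neighbor in (x, y).
    eps = np.sort(dz, axis=1)[:, k - 1]
    # n_x, n_y: marginal neighbors strictly within eps_i (self excluded).
    nx = np.sum(dx < eps[:, None], axis=1) - 1
    ny = np.sum(dy < eps[:, None], axis=1) - 1
    # I(X;Y) ~ psi(k) + psi(N) - < psi(n_x + 1) + psi(n_y + 1) >
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))
```

On weakly dependent data this estimator behaves well, but as the dependence (and hence the true MI) grows, the local-uniformity assumption inside each eps-ball fails and the estimate saturates unless the sample size grows exponentially — the phenomenon the paper quantifies.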

Cite this Paper


BibTeX
@InProceedings{pmlr-v38-gao15,
  title     = {{Efficient Estimation of Mutual Information for Strongly Dependent Variables}},
  author    = {Gao, Shuyang and Ver Steeg, Greg and Galstyan, Aram},
  booktitle = {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {277--286},
  year      = {2015},
  editor    = {Lebanon, Guy and Vishwanathan, S. V. N.},
  volume    = {38},
  series    = {Proceedings of Machine Learning Research},
  address   = {San Diego, California, USA},
  month     = {09--12 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v38/gao15.pdf},
  url       = {https://proceedings.mlr.press/v38/gao15.html},
  abstract  = {We demonstrate that a popular class of non-parametric mutual information (MI) estimators based on k-nearest-neighbor graphs requires number of samples that scales exponentially with the true MI. Consequently, accurate estimation of MI between two strongly dependent variables is possible only for prohibitively large sample size. This important yet overlooked shortcoming of the existing estimators is due to their implicit reliance on local uniformity of the underlying joint distribution. We introduce a new estimator that is robust to local non-uniformity, works well with limited data, and is able to capture relationship strengths over many orders of magnitude. We demonstrate the superior performance of the proposed estimator on both synthetic and real-world data.}
}
Endnote
%0 Conference Paper
%T Efficient Estimation of Mutual Information for Strongly Dependent Variables
%A Shuyang Gao
%A Greg Ver Steeg
%A Aram Galstyan
%B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2015
%E Guy Lebanon
%E S. V. N. Vishwanathan
%F pmlr-v38-gao15
%I PMLR
%P 277--286
%U https://proceedings.mlr.press/v38/gao15.html
%V 38
%X We demonstrate that a popular class of non-parametric mutual information (MI) estimators based on k-nearest-neighbor graphs requires number of samples that scales exponentially with the true MI. Consequently, accurate estimation of MI between two strongly dependent variables is possible only for prohibitively large sample size. This important yet overlooked shortcoming of the existing estimators is due to their implicit reliance on local uniformity of the underlying joint distribution. We introduce a new estimator that is robust to local non-uniformity, works well with limited data, and is able to capture relationship strengths over many orders of magnitude. We demonstrate the superior performance of the proposed estimator on both synthetic and real-world data.
RIS
TY  - CPAPER
TI  - Efficient Estimation of Mutual Information for Strongly Dependent Variables
AU  - Shuyang Gao
AU  - Greg Ver Steeg
AU  - Aram Galstyan
BT  - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
DA  - 2015/02/21
ED  - Guy Lebanon
ED  - S. V. N. Vishwanathan
ID  - pmlr-v38-gao15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 38
SP  - 277
EP  - 286
L1  - http://proceedings.mlr.press/v38/gao15.pdf
UR  - https://proceedings.mlr.press/v38/gao15.html
AB  - We demonstrate that a popular class of non-parametric mutual information (MI) estimators based on k-nearest-neighbor graphs requires number of samples that scales exponentially with the true MI. Consequently, accurate estimation of MI between two strongly dependent variables is possible only for prohibitively large sample size. This important yet overlooked shortcoming of the existing estimators is due to their implicit reliance on local uniformity of the underlying joint distribution. We introduce a new estimator that is robust to local non-uniformity, works well with limited data, and is able to capture relationship strengths over many orders of magnitude. We demonstrate the superior performance of the proposed estimator on both synthetic and real-world data.
ER  -
APA
Gao, S., Ver Steeg, G., & Galstyan, A. (2015). Efficient Estimation of Mutual Information for Strongly Dependent Variables. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research, 38:277-286. Available from https://proceedings.mlr.press/v38/gao15.html.