Hypothesis testing with low-degree polynomials in the Morris class of exponential families

Dmitriy Kunisky
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2822-2848, 2021.

Abstract

Analysis of low-degree polynomial algorithms is a powerful, newly-popular method for predicting computational thresholds in hypothesis testing problems. One limitation of current techniques for this analysis is their restriction to Bernoulli and Gaussian distributions. We expand this range of possibilities by performing the low-degree analysis of hypothesis testing for the Morris class of natural exponential families with quadratic variance function, giving a unified treatment of Gaussian, Poisson, gamma (including exponential and chi-squared), binomial (including Bernoulli), negative binomial (including geometric), and generalized hyperbolic secant distributions. We then give several algorithmic applications.

1. In models where a random signal is observed through coordinatewise-independent noise applied in an exponential family, the success or failure of low-degree polynomials is governed by the z-score overlap, the inner product of z-score vectors with respect to the null distribution of two independent copies of the signal.

2. In the same models, testing with low-degree polynomials exhibits channel monotonicity: the above distributions admit a total ordering by computational cost of hypothesis testing, according to a scalar parameter describing how the variance depends on the mean in an exponential family.

3. In a spiked matrix model with a particular non-Gaussian noise distribution, the low-degree prediction is incorrect unless polynomials with arbitrarily large degree in individual matrix entries are permitted. This shows that polynomials summing over self-avoiding walks and variants thereof, as proposed recently by Ding, Hopkins, and Steurer (2020) for spiked matrix models with heavy-tailed noise, are strictly suboptimal for this model.

Thus low-degree polynomials appear to offer a tradeoff between robustness and strong performance fine-tuned to specific models. Inspired by this, we suggest that a class of problems requiring "exploration before inference," where an algorithm must first examine the input and then use some intermediate computation to choose a suitable inference subroutine, appears especially difficult for low-degree polynomials.
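
As a rough illustration of the quantities named in the abstract, the sketch below tabulates the quadratic variance functions V(mu) = v0 + v1*mu + v2*mu^2 of the six Morris families in their standard parameterizations and computes a z-score overlap between two signal copies. The names `QVF_COEFFS`, `variance`, `z_scores`, and `z_score_overlap` are hypothetical, and the z-score definition used here (signal mean minus null mean, divided by the null standard deviation) is an assumed reading of the abstract, not necessarily the paper's exact definition.

```python
import math

# Coefficients (v0, v1, v2) of V(mu) = v0 + v1*mu + v2*mu**2 for each family.
# Each entry takes one family parameter (variance, shape, number of trials, ...).
# These are the textbook parameterizations, assumed here for illustration.
QVF_COEFFS = {
    "gaussian": lambda sigma2: (sigma2, 0.0, 0.0),       # V = sigma^2
    "poisson": lambda _unused: (0.0, 1.0, 0.0),          # V = mu
    "gamma": lambda shape: (0.0, 0.0, 1.0 / shape),      # V = mu^2 / shape
    "binomial": lambda n: (0.0, 1.0, -1.0 / n),          # V = mu - mu^2 / n
    "negative_binomial": lambda r: (0.0, 1.0, 1.0 / r),  # V = mu + mu^2 / r
    "hyperbolic_secant": lambda r: (r, 0.0, 1.0 / r),    # V = r + mu^2 / r
}


def variance(family, mu, param=1.0):
    """Variance of the given family at mean mu."""
    v0, v1, v2 = QVF_COEFFS[family](param)
    return v0 + v1 * mu + v2 * mu * mu


def z_scores(signal_means, null_means, family, param=1.0):
    """Standardize each signal mean against the null mean and null variance."""
    return [
        (m - m0) / math.sqrt(variance(family, m0, param))
        for m, m0 in zip(signal_means, null_means)
    ]


def z_score_overlap(means_a, means_b, null_means, family, param=1.0):
    """Inner product of the z-score vectors of two signal copies."""
    za = z_scores(means_a, null_means, family, param)
    zb = z_scores(means_b, null_means, family, param)
    return sum(a * b for a, b in zip(za, zb))


if __name__ == "__main__":
    null = [4.0, 4.0]
    print(z_score_overlap([6.0, 4.0], [2.0, 4.0], null, "poisson"))
```

The quadratic coefficient v2 is one scalar that distinguishes the families in this table; whether it coincides with the ordering parameter used for channel monotonicity is left to the paper itself.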

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-kunisky21a,
  title = {Hypothesis testing with low-degree polynomials in the Morris class of exponential families},
  author = {Kunisky, Dmitriy},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages = {2822--2848},
  year = {2021},
  editor = {Belkin, Mikhail and Kpotufe, Samory},
  volume = {134},
  series = {Proceedings of Machine Learning Research},
  month = {15--19 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v134/kunisky21a/kunisky21a.pdf},
  url = {https://proceedings.mlr.press/v134/kunisky21a.html},
  abstract = {Analysis of low-degree polynomial algorithms is a powerful, newly-popular method for predicting computational thresholds in hypothesis testing problems. One limitation of current techniques for this analysis is their restriction to Bernoulli and Gaussian distributions. We expand this range of possibilities by performing the low-degree analysis of hypothesis testing for the Morris class of natural exponential families with quadratic variance function, giving a unified treatment of Gaussian, Poisson, gamma (including exponential and chi-squared), binomial (including Bernoulli), negative binomial (including geometric), and generalized hyperbolic secant distributions. We then give several algorithmic applications. 1. In models where a random signal is observed through coordinatewise-independent noise applied in an exponential family, the success or failure of low-degree polynomials is governed by the z-score overlap, the inner product of z-score vectors with respect to the null distribution of two independent copies of the signal. 2. In the same models, testing with low-degree polynomials exhibits channel monotonicity: the above distributions admit a total ordering by computational cost of hypothesis testing, according to a scalar parameter describing how the variance depends on the mean in an exponential family. 3. In a spiked matrix model with a particular non-Gaussian noise distribution, the low-degree prediction is incorrect unless polynomials with arbitrarily large degree in individual matrix entries are permitted. This shows that polynomials summing over self-avoiding walks and variants thereof, as proposed recently by Ding, Hopkins, and Steurer (2020) for spiked matrix models with heavy-tailed noise, are strictly suboptimal for this model. Thus low-degree polynomials appear to offer a tradeoff between robustness and strong performance fine-tuned to specific models. Inspired by this, we suggest that a class of problems requiring "exploration before inference," where an algorithm must first examine the input and then use some intermediate computation to choose a suitable inference subroutine, appears especially difficult for low-degree polynomials.}
}
Endnote
%0 Conference Paper
%T Hypothesis testing with low-degree polynomials in the Morris class of exponential families
%A Dmitriy Kunisky
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-kunisky21a
%I PMLR
%P 2822--2848
%U https://proceedings.mlr.press/v134/kunisky21a.html
%V 134
%X Analysis of low-degree polynomial algorithms is a powerful, newly-popular method for predicting computational thresholds in hypothesis testing problems. One limitation of current techniques for this analysis is their restriction to Bernoulli and Gaussian distributions. We expand this range of possibilities by performing the low-degree analysis of hypothesis testing for the Morris class of natural exponential families with quadratic variance function, giving a unified treatment of Gaussian, Poisson, gamma (including exponential and chi-squared), binomial (including Bernoulli), negative binomial (including geometric), and generalized hyperbolic secant distributions. We then give several algorithmic applications. 1. In models where a random signal is observed through coordinatewise-independent noise applied in an exponential family, the success or failure of low-degree polynomials is governed by the z-score overlap, the inner product of z-score vectors with respect to the null distribution of two independent copies of the signal. 2. In the same models, testing with low-degree polynomials exhibits channel monotonicity: the above distributions admit a total ordering by computational cost of hypothesis testing, according to a scalar parameter describing how the variance depends on the mean in an exponential family. 3. In a spiked matrix model with a particular non-Gaussian noise distribution, the low-degree prediction is incorrect unless polynomials with arbitrarily large degree in individual matrix entries are permitted. This shows that polynomials summing over self-avoiding walks and variants thereof, as proposed recently by Ding, Hopkins, and Steurer (2020) for spiked matrix models with heavy-tailed noise, are strictly suboptimal for this model. Thus low-degree polynomials appear to offer a tradeoff between robustness and strong performance fine-tuned to specific models. Inspired by this, we suggest that a class of problems requiring "exploration before inference," where an algorithm must first examine the input and then use some intermediate computation to choose a suitable inference subroutine, appears especially difficult for low-degree polynomials.
APA
Kunisky, D. (2021). Hypothesis testing with low-degree polynomials in the Morris class of exponential families. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:2822-2848. Available from https://proceedings.mlr.press/v134/kunisky21a.html.
