Sharper Bounds for Chebyshev Moment Matching, with Applications

Cameron Musco, Christopher Musco, Lucas Rosenblatt, Apoorv Vikram Singh
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:4309-4358, 2025.

Abstract

We study the problem of approximately recovering a probability distribution given noisy measurements of its Chebyshev polynomial moments. This problem arises broadly across algorithms, statistics, and machine learning. By leveraging a \emph{global decay bound} on the coefficients in the Chebyshev expansion of any Lipschitz function, we sharpen prior work, proving that accurate recovery in the Wasserstein distance is possible with more noise than previously known. Our result immediately yields a number of applications: (1) We give a simple “linear query” algorithm for constructing a differentially private synthetic data distribution with Wasserstein-$1$ error $\tilde{O}(1/n)$ based on a dataset of $n$ points in $[-1,1]$. This bound is optimal up to log factors and matches a recent breakthrough of Boedihardjo, Strohmer, and Vershynin [Probab. Theory. Rel., 2024], which uses a more complex “superregular random walk” method to beat an $O(1/\sqrt{n})$ accuracy barrier inherent to earlier approaches. (2) We give an $\tilde{O}(n^2/\epsilon)$ time algorithm for the linear algebraic problem of estimating the spectral density of an $n\times n$ symmetric matrix up to $\epsilon$ error in the Wasserstein distance. Our result accelerates prior methods from Chen et al. [ICML 2021] and Braverman et al. [STOC 2022]. (3) We tighten an analysis of Vinayak, Kong, Valiant, and Kakade [ICML 2019] on the maximum likelihood estimator for the statistical problem of “Learning Populations of Parameters”, extending the parameter regime in which sample optimal results can be obtained. Beyond these main results, we provide an extension of our bound to estimating distributions in $d > 1$ dimensions. We hope that these bounds will find applications more broadly to problems involving distribution recovery from noisy moment information.
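Why do Chebyshev moments control the Wasserstein distance? A standard reduction (stated here informally, and not specific to this paper) combines Kantorovich–Rubinstein duality with the Chebyshev expansion of the test function: for probability measures $p, q$ on $[-1,1]$,

$$W_1(p, q) = \sup_{\|f\|_{\mathrm{Lip}} \le 1} \int_{-1}^{1} f \, d(p - q), \qquad f = \sum_{k \ge 0} a_k T_k \;\Longrightarrow\; \Big| \int f \, d(p - q) \Big| \le \sum_{k \ge 1} |a_k| \cdot \big| \mathbb{E}_p[T_k] - \mathbb{E}_q[T_k] \big|,$$

where $T_k$ is the degree-$k$ Chebyshev polynomial and the $k = 0$ term vanishes because $p$ and $q$ are both probability measures. Since the coefficients $a_k$ of a Lipschitz function decay, matching the first several Chebyshev moments forces $p$ and $q$ to be close in $W_1$; the paper's contribution is a sharper \emph{global} bound on the coefficients taken jointly, which tolerates more noise per moment.

To make application (1) concrete, the following is a minimal Python sketch of the generic "linear query" pipeline: release Laplace-noised empirical Chebyshev moments, then solve a small linear program for a grid distribution matching them. All names, the noise allocation, and the choice of the number of moments are illustrative placeholders, not the paper's tuned parameters or solver.

import numpy as np
from numpy.polynomial import chebyshev as C
from scipy.optimize import linprog

def noisy_chebyshev_moments(x, k, eps):
    """Privatized empirical Chebyshev moments m_j = mean(T_j(x)), j = 1..k.

    Each moment is a linear query with L1 sensitivity 2/n (|T_j| <= 1 on
    [-1, 1]), so Laplace noise of scale 2k/(n*eps) per moment gives
    eps-differential privacy for all k queries by basic composition.
    """
    n = len(x)
    moments = np.array([C.chebval(x, np.eye(k + 1)[j]).mean()
                        for j in range(1, k + 1)])
    return moments + np.random.laplace(scale=2 * k / (n * eps), size=k)

def match_moments(noisy_m, k, grid_size=1000):
    """Find a grid distribution minimizing the max moment mismatch (an LP)."""
    grid = np.cos(np.pi * (np.arange(grid_size) + 0.5) / grid_size)  # Chebyshev nodes
    # A[j-1, i] = T_j(grid[i]) for j = 1..k
    A = np.vstack([C.chebval(grid, np.eye(k + 1)[j]) for j in range(1, k + 1)])
    # Variables: probabilities p (grid_size entries) and a slack t.
    # Minimize t subject to |A p - noisy_m| <= t, sum(p) = 1, p >= 0.
    c = np.concatenate([np.zeros(grid_size), [1.0]])
    A_ub = np.block([[A, -np.ones((k, 1))], [-A, -np.ones((k, 1))]])
    b_ub = np.concatenate([noisy_m, -noisy_m])
    A_eq = np.concatenate([np.ones(grid_size), [0.0]])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * grid_size + [(None, None)])
    return grid, res.x[:grid_size]

Sampling from the returned grid distribution, e.g. np.random.choice(grid, size=n, p=p), yields a synthetic dataset. This sketch only illustrates the pipeline; the $\tilde{O}(1/n)$ Wasserstein guarantee in the paper comes from the sharper moment-matching analysis, not from this particular solver or noise split.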

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-musco25a,
  title     = {Sharper Bounds for Chebyshev Moment Matching, with Applications},
  author    = {Musco, Cameron and Musco, Christopher and Rosenblatt, Lucas and Singh, Apoorv Vikram},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages     = {4309--4358},
  year      = {2025},
  editor    = {Haghtalab, Nika and Moitra, Ankur},
  volume    = {291},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/musco25a/musco25a.pdf},
  url       = {https://proceedings.mlr.press/v291/musco25a.html},
  abstract  = {We study the problem of approximately recovering a probability distribution given noisy measurements of its Chebyshev polynomial moments. This problem arises broadly across algorithms, statistics, and machine learning. By leveraging a \emph{global decay bound} on the coefficients in the Chebyshev expansion of any Lipschitz function, we sharpen prior work, proving that accurate recovery in the Wasserstein distance is possible with more noise than previously known. Our result immediately yields a number of applications: (1) We give a simple “linear query” algorithm for constructing a differentially private synthetic data distribution with Wasserstein-$1$ error $\tilde{O}(1/n)$ based on a dataset of $n$ points in $[-1,1]$. This bound is optimal up to log factors and matches a recent breakthrough of Boedihardjo, Strohmer, and Vershynin [Probab. Theory. Rel., 2024], which uses a more complex “superregular random walk” method to beat an $O(1/\sqrt{n})$ accuracy barrier inherent to earlier approaches. (2) We give an $\tilde{O}(n^2/\epsilon)$ time algorithm for the linear algebraic problem of estimating the spectral density of an $n\times n$ symmetric matrix up to $\epsilon$ error in the Wasserstein distance. Our result accelerates prior methods from Chen et al. [ICML 2021] and Braverman et al. [STOC 2022]. (3) We tighten an analysis of Vinayak, Kong, Valiant, and Kakade [ICML 2019] on the maximum likelihood estimator for the statistical problem of “Learning Populations of Parameters”, extending the parameter regime in which sample optimal results can be obtained. Beyond these main results, we provide an extension of our bound to estimating distributions in $d > 1$ dimensions. We hope that these bounds will find applications more broadly to problems involving distribution recovery from noisy moment information.}
}
Endnote
%0 Conference Paper
%T Sharper Bounds for Chebyshev Moment Matching, with Applications
%A Cameron Musco
%A Christopher Musco
%A Lucas Rosenblatt
%A Apoorv Vikram Singh
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-musco25a
%I PMLR
%P 4309--4358
%U https://proceedings.mlr.press/v291/musco25a.html
%V 291
%X We study the problem of approximately recovering a probability distribution given noisy measurements of its Chebyshev polynomial moments. This problem arises broadly across algorithms, statistics, and machine learning. By leveraging a \emph{global decay bound} on the coefficients in the Chebyshev expansion of any Lipschitz function, we sharpen prior work, proving that accurate recovery in the Wasserstein distance is possible with more noise than previously known. Our result immediately yields a number of applications: (1) We give a simple “linear query” algorithm for constructing a differentially private synthetic data distribution with Wasserstein-$1$ error $\tilde{O}(1/n)$ based on a dataset of $n$ points in $[-1,1]$. This bound is optimal up to log factors and matches a recent breakthrough of Boedihardjo, Strohmer, and Vershynin [Probab. Theory. Rel., 2024], which uses a more complex “superregular random walk” method to beat an $O(1/\sqrt{n})$ accuracy barrier inherent to earlier approaches. (2) We give an $\tilde{O}(n^2/\epsilon)$ time algorithm for the linear algebraic problem of estimating the spectral density of an $n\times n$ symmetric matrix up to $\epsilon$ error in the Wasserstein distance. Our result accelerates prior methods from Chen et al. [ICML 2021] and Braverman et al. [STOC 2022]. (3) We tighten an analysis of Vinayak, Kong, Valiant, and Kakade [ICML 2019] on the maximum likelihood estimator for the statistical problem of “Learning Populations of Parameters”, extending the parameter regime in which sample optimal results can be obtained. Beyond these main results, we provide an extension of our bound to estimating distributions in $d > 1$ dimensions. We hope that these bounds will find applications more broadly to problems involving distribution recovery from noisy moment information.
APA
Musco, C., Musco, C., Rosenblatt, L. & Singh, A.V. (2025). Sharper Bounds for Chebyshev Moment Matching, with Applications. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:4309-4358. Available from https://proceedings.mlr.press/v291/musco25a.html.