Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization

Ruizhong Qiu, Hanghang Tong
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:41717-41748, 2024.

Abstract

We study nonconvex zeroth-order optimization (ZOO) in a high-dimensional space $\mathbb R^d$ for functions with approximately $s$-sparse gradients. To reduce the dependence on the dimensionality $d$ in the query complexity, high-dimensional ZOO methods seek to leverage gradient sparsity to design gradient estimators. The previous best method needs $O\big(s\log\frac ds\big)$ queries per step to achieve $O\big(\frac1T\big)$ rate of convergence w.r.t. the number T of steps. In this paper, we propose Gradient Compressed Sensing (GraCe), a query-efficient and accurate estimator for sparse gradients that uses only $O\big(s\log\log\frac ds\big)$ queries per step and still achieves $O\big(\frac1T\big)$ rate of convergence. To our best knowledge, we are the first to achieve a double-logarithmic dependence on $d$ in the query complexity under weaker assumptions. Our proposed GraCe generalizes the Indyk–Price–Woodruff (IPW) algorithm in compressed sensing from linear measurements to nonlinear functions. Furthermore, since the IPW algorithm is purely theoretical due to its impractically large constant, we improve the IPW algorithm via our dependent random partition technique together with our corresponding novel analysis and successfully reduce the constant by a factor of nearly $4300$. Our GraCe is not only theoretically query-efficient but also achieves strong empirical performance. We benchmark our GraCe against $12$ existing ZOO methods with $10000$-dimensional functions and demonstrate that GraCe significantly outperforms existing methods. Our code is publicly available at https://github.com/q-rz/ICML24-GraCe.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-qiu24g, title = {Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization}, author = {Qiu, Ruizhong and Tong, Hanghang}, booktitle = {Proceedings of the 41st International Conference on Machine Learning}, pages = {41717--41748}, year = {2024}, editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix}, volume = {235}, series = {Proceedings of Machine Learning Research}, month = {21--27 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/qiu24g/qiu24g.pdf}, url = {https://proceedings.mlr.press/v235/qiu24g.html}, abstract = {We study nonconvex zeroth-order optimization (ZOO) in a high-dimensional space $\mathbb R^d$ for functions with approximately $s$-sparse gradients. To reduce the dependence on the dimensionality $d$ in the query complexity, high-dimensional ZOO methods seek to leverage gradient sparsity to design gradient estimators. The previous best method needs $O\big(s\log\frac ds\big)$ queries per step to achieve $O\big(\frac1T\big)$ rate of convergence w.r.t. the number T of steps. In this paper, we propose Gradient Compressed Sensing (GraCe), a query-efficient and accurate estimator for sparse gradients that uses only $O\big(s\log\log\frac ds\big)$ queries per step and still achieves $O\big(\frac1T\big)$ rate of convergence. To our best knowledge, we are the first to achieve a double-logarithmic dependence on $d$ in the query complexity under weaker assumptions. Our proposed GraCe generalizes the Indyk–Price–Woodruff (IPW) algorithm in compressed sensing from linear measurements to nonlinear functions. Furthermore, since the IPW algorithm is purely theoretical due to its impractically large constant, we improve the IPW algorithm via our dependent random partition technique together with our corresponding novel analysis and successfully reduce the constant by a factor of nearly $4300$. Our GraCe is not only theoretically query-efficient but also achieves strong empirical performance. We benchmark our GraCe against $12$ existing ZOO methods with $10000$-dimensional functions and demonstrate that GraCe significantly outperforms existing methods. Our code is publicly available at https://github.com/q-rz/ICML24-GraCe.} }
Endnote
%0 Conference Paper %T Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization %A Ruizhong Qiu %A Hanghang Tong %B Proceedings of the 41st International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2024 %E Ruslan Salakhutdinov %E Zico Kolter %E Katherine Heller %E Adrian Weller %E Nuria Oliver %E Jonathan Scarlett %E Felix Berkenkamp %F pmlr-v235-qiu24g %I PMLR %P 41717--41748 %U https://proceedings.mlr.press/v235/qiu24g.html %V 235 %X We study nonconvex zeroth-order optimization (ZOO) in a high-dimensional space $\mathbb R^d$ for functions with approximately $s$-sparse gradients. To reduce the dependence on the dimensionality $d$ in the query complexity, high-dimensional ZOO methods seek to leverage gradient sparsity to design gradient estimators. The previous best method needs $O\big(s\log\frac ds\big)$ queries per step to achieve $O\big(\frac1T\big)$ rate of convergence w.r.t. the number T of steps. In this paper, we propose Gradient Compressed Sensing (GraCe), a query-efficient and accurate estimator for sparse gradients that uses only $O\big(s\log\log\frac ds\big)$ queries per step and still achieves $O\big(\frac1T\big)$ rate of convergence. To our best knowledge, we are the first to achieve a double-logarithmic dependence on $d$ in the query complexity under weaker assumptions. Our proposed GraCe generalizes the Indyk–Price–Woodruff (IPW) algorithm in compressed sensing from linear measurements to nonlinear functions. Furthermore, since the IPW algorithm is purely theoretical due to its impractically large constant, we improve the IPW algorithm via our dependent random partition technique together with our corresponding novel analysis and successfully reduce the constant by a factor of nearly $4300$. Our GraCe is not only theoretically query-efficient but also achieves strong empirical performance. We benchmark our GraCe against $12$ existing ZOO methods with $10000$-dimensional functions and demonstrate that GraCe significantly outperforms existing methods. Our code is publicly available at https://github.com/q-rz/ICML24-GraCe.
APA
Qiu, R. & Tong, H.. (2024). Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:41717-41748 Available from https://proceedings.mlr.press/v235/qiu24g.html.

Related Material