Leverage Score Sampling for Faster Accelerated Regression and ERM

Naman Agarwal, Sham Kakade, Rahul Kidambi, Yin-Tat Lee, Praneeth Netrapalli, Aaron Sidford
Proceedings of the 31st International Conference on Algorithmic Learning Theory, PMLR 117:22-47, 2020.

Abstract

Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b\in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $\min_{x\in\mathbb{R}^{d}}\frac{1}{2}\|\mathbf{A}x-b\|_{2}^{2}$ in time $\widetilde{O}((n+\sqrt{d\cdot\kappa_{\text{sum}}})\, s \log\epsilon^{-1})$, where $\kappa_{\text{sum}}=\operatorname{tr}(\mathbf{A}^{\top}\mathbf{A})/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$. This improves upon the previous best running time of $\widetilde{O}((n+\sqrt{n\cdot\kappa_{\text{sum}}})\, s \log\epsilon^{-1})$. We achieve our result through an interesting combination of leverage score sampling, proximal point methods, and accelerated coordinate descent methods. Further, we show that our method not only matches the performance of previous methods up to polylogarithmic factors, but improves further whenever the leverage scores of the rows are small. We also provide a non-linear generalization of these results that improves the running time for solving a broader class of ERM problems and expands the set of ERM problems provably solvable in nearly linear time.
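To make the abstract's quantities concrete, here is a minimal numpy sketch: it computes $\kappa_{\text{sum}}$, the statistical leverage scores $\tau_i = a_i^{\top}(\mathbf{A}^{\top}\mathbf{A})^{-1}a_i$ of the rows, and the $\sqrt{d\cdot\kappa_{\text{sum}}}$ versus $\sqrt{n\cdot\kappa_{\text{sum}}}$ comparison behind the claimed improvement. The random test matrix and all variable names are illustrative assumptions; this is not the paper's accelerated algorithm.

```python
import numpy as np

# Illustrative sketch only: computes the quantities named in the
# abstract on a random dense instance, not the paper's method.
rng = np.random.default_rng(0)
n, d = 1000, 50
A = rng.standard_normal((n, d))

# kappa_sum = tr(A^T A) / lambda_min(A^T A); eigvalsh sorts ascending,
# so eigvals[0] is the smallest eigenvalue.
gram = A.T @ A
eigvals = np.linalg.eigvalsh(gram)
kappa_sum = np.trace(gram) / eigvals[0]

# Leverage score of row i: tau_i = a_i^T (A^T A)^{-1} a_i.
# The scores sum to rank(A) = d, so when individual scores are small,
# rows can be sampled at proportionally lower rates.
tau = np.einsum('ij,jk,ik->i', A, np.linalg.inv(gram), A)
assert np.isclose(tau.sum(), d)

# The improvement: sqrt(d * kappa_sum) <= sqrt(n * kappa_sum) since d <= n.
print(f"kappa_sum           = {kappa_sum:.1f}")
print(f"sqrt(d * kappa_sum) = {np.sqrt(d * kappa_sum):.1f}")
print(f"sqrt(n * kappa_sum) = {np.sqrt(n * kappa_sum):.1f}")
```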

Cite this Paper


BibTeX
@InProceedings{pmlr-v117-agarwal20a,
  title     = {Leverage Score Sampling for Faster Accelerated Regression and ERM},
  author    = {Agarwal, Naman and Kakade, Sham and Kidambi, Rahul and Lee, Yin-Tat and Netrapalli, Praneeth and Sidford, Aaron},
  booktitle = {Proceedings of the 31st International Conference on Algorithmic Learning Theory},
  pages     = {22--47},
  year      = {2020},
  editor    = {Kontorovich, Aryeh and Neu, Gergely},
  volume    = {117},
  series    = {Proceedings of Machine Learning Research},
  month     = {08 Feb--11 Feb},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v117/agarwal20a/agarwal20a.pdf},
  url       = {https://proceedings.mlr.press/v117/agarwal20a.html},
  abstract  = {Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b\in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $\min_{x\in\mathbb{R}^{d}}\frac{1}{2}\|\mathbf{A}x-b\|_{2}^{2}$ in time $\widetilde{O}((n+\sqrt{d\cdot\kappa_{\text{sum}}})\, s \log\epsilon^{-1})$, where $\kappa_{\text{sum}}=\operatorname{tr}(\mathbf{A}^{\top}\mathbf{A})/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$. This improves upon the previous best running time of $\widetilde{O}((n+\sqrt{n\cdot\kappa_{\text{sum}}})\, s \log\epsilon^{-1})$. We achieve our result through an interesting combination of leverage score sampling, proximal point methods, and accelerated coordinate descent methods. Further, we show that our method not only matches the performance of previous methods up to polylogarithmic factors, but improves further whenever the leverage scores of the rows are small. We also provide a non-linear generalization of these results that improves the running time for solving a broader class of ERM problems and expands the set of ERM problems provably solvable in nearly linear time.}
}
Endnote
%0 Conference Paper
%T Leverage Score Sampling for Faster Accelerated Regression and ERM
%A Naman Agarwal
%A Sham Kakade
%A Rahul Kidambi
%A Yin-Tat Lee
%A Praneeth Netrapalli
%A Aaron Sidford
%B Proceedings of the 31st International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2020
%E Aryeh Kontorovich
%E Gergely Neu
%F pmlr-v117-agarwal20a
%I PMLR
%P 22--47
%U https://proceedings.mlr.press/v117/agarwal20a.html
%V 117
%X Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b\in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $\min_{x\in\mathbb{R}^{d}}\frac{1}{2}\|\mathbf{A}x-b\|_{2}^{2}$ in time $\widetilde{O}((n+\sqrt{d\cdot\kappa_{\text{sum}}})\, s \log\epsilon^{-1})$, where $\kappa_{\text{sum}}=\operatorname{tr}(\mathbf{A}^{\top}\mathbf{A})/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$. This improves upon the previous best running time of $\widetilde{O}((n+\sqrt{n\cdot\kappa_{\text{sum}}})\, s \log\epsilon^{-1})$. We achieve our result through an interesting combination of leverage score sampling, proximal point methods, and accelerated coordinate descent methods. Further, we show that our method not only matches the performance of previous methods up to polylogarithmic factors, but improves further whenever the leverage scores of the rows are small. We also provide a non-linear generalization of these results that improves the running time for solving a broader class of ERM problems and expands the set of ERM problems provably solvable in nearly linear time.
APA
Agarwal, N., Kakade, S., Kidambi, R., Lee, Y., Netrapalli, P. & Sidford, A. (2020). Leverage Score Sampling for Faster Accelerated Regression and ERM. Proceedings of the 31st International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 117:22-47. Available from https://proceedings.mlr.press/v117/agarwal20a.html.