On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don’t Worry About its Nonsmooth Loss Function

Xinguo Li, Haoming Jiang, Jarvis Haupt, Raman Arora, Han Liu, Mingyi Hong, Tuo Zhao
Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, PMLR 115:49-59, 2020.

Abstract

Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility. However, by exploring the modeling structures, we find these “sacrifices” do not always require more computational effort. To shed light on such a “free-lunch” phenomenon, we study the square-root-Lasso (SQRT-Lasso) type regression problem. Specifically, we show that the nonsmooth loss functions of SQRT-Lasso type regression ease tuning effort and gain adaptivity to inhomogeneous noise, but are not necessarily more challenging than Lasso in computation. We can directly apply proximal algorithms (e.g., proximal gradient descent, proximal Newton, and proximal quasi-Newton algorithms) without worrying about the nonsmoothness of the loss function. Theoretically, we prove that the proximal algorithms combined with the pathwise optimization scheme enjoy fast convergence guarantees with high probability. Numerical results are provided to support our theory.
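
To make the setup concrete, the sketch below shows proximal gradient descent applied directly to the SQRT-Lasso objective min_theta ||y - X theta||_2 / sqrt(n) + lambda ||theta||_1, combined with a simple pathwise (warm-start) scheme over a decreasing sequence of regularization parameters. This is a minimal illustration of the general approach described in the abstract, not the authors' implementation; the function names, fixed step size, stopping rule, and lambda grid are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (entrywise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sqrt_lasso_prox_grad(X, y, lam, theta0=None, step=0.5, max_iter=1000, tol=1e-8):
    """Proximal gradient descent on ||y - X theta||_2 / sqrt(n) + lam * ||theta||_1.

    The loss is nonsmooth only where the residual vanishes; away from that point
    its gradient is X^T (X theta - y) / (sqrt(n) * ||y - X theta||_2).
    The fixed step size is an illustrative choice, not a tuned constant.
    """
    n, d = X.shape
    theta = np.zeros(d) if theta0 is None else theta0.copy()
    for _ in range(max_iter):
        resid = y - X @ theta
        r_norm = np.linalg.norm(resid)
        if r_norm < 1e-12:  # residual ~ 0: the loss is nondifferentiable here, so stop
            break
        grad = -(X.T @ resid) / (np.sqrt(n) * r_norm)
        theta_new = soft_threshold(theta - step * grad, step * lam)
        if np.linalg.norm(theta_new - theta) <= tol * max(1.0, np.linalg.norm(theta)):
            theta = theta_new
            break
        theta = theta_new
    return theta

def pathwise_sqrt_lasso(X, y, lam_grid):
    """Pathwise (warm-start) scheme: solve for decreasing lambdas, reusing each
    solution as the initialization for the next, smaller lambda."""
    theta = np.zeros(X.shape[1])
    path = []
    for lam in sorted(lam_grid, reverse=True):
        theta = sqrt_lasso_prox_grad(X, y, lam, theta0=theta)
        path.append((lam, theta.copy()))
    return path
```

Informally, warm starts along the regularization path keep the iterates away from the region where the residual is near zero and the loss is poorly behaved, which is the intuition behind the fast convergence guarantees summarized in the abstract.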

Cite this Paper


BibTeX
@InProceedings{pmlr-v115-li20a,
  title = {On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don’t Worry About its Nonsmooth Loss Function},
  author = {Li, Xinguo and Jiang, Haoming and Haupt, Jarvis and Arora, Raman and Liu, Han and Hong, Mingyi and Zhao, Tuo},
  booktitle = {Proceedings of The 35th Uncertainty in Artificial Intelligence Conference},
  pages = {49--59},
  year = {2020},
  editor = {Adams, Ryan P. and Gogate, Vibhav},
  volume = {115},
  series = {Proceedings of Machine Learning Research},
  month = {22--25 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v115/li20a/li20a.pdf},
  url = {https://proceedings.mlr.press/v115/li20a.html},
  abstract = {Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility. However, by exploring the modeling structures, we find these “sacrifices” do not always require more computational efforts. To shed light on such a “free-lunch” phenomenon, we study the square-root-Lasso (SQRT-Lasso) type regression problem. Specifically, we show that the nonsmooth loss functions of SQRT-Lasso type regression ease tuning effort and gain adaptivity to inhomogeneous noise, but is not necessarily more challenging than Lasso in computation. We can directly apply proximal algorithms (e.g. proximal gradient descent, proximal Newton, and proximal quasi-Newton algorithms) without worrying about the nonsmoothness of the loss function. Theoretically, we prove that the proximal algorithms combined with the pathwise optimization scheme enjoy fast convergence guarantees with high probability. Numerical results are provided to support our theory.}
}
Endnote
%0 Conference Paper
%T On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don’t Worry About its Nonsmooth Loss Function
%A Xinguo Li
%A Haoming Jiang
%A Jarvis Haupt
%A Raman Arora
%A Han Liu
%A Mingyi Hong
%A Tuo Zhao
%B Proceedings of The 35th Uncertainty in Artificial Intelligence Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Ryan P. Adams
%E Vibhav Gogate
%F pmlr-v115-li20a
%I PMLR
%P 49--59
%U https://proceedings.mlr.press/v115/li20a.html
%V 115
%X Many machine learning techniques sacrifice convenient computational structures to gain estimation robustness and modeling flexibility. However, by exploring the modeling structures, we find these “sacrifices” do not always require more computational efforts. To shed light on such a “free-lunch” phenomenon, we study the square-root-Lasso (SQRT-Lasso) type regression problem. Specifically, we show that the nonsmooth loss functions of SQRT-Lasso type regression ease tuning effort and gain adaptivity to inhomogeneous noise, but is not necessarily more challenging than Lasso in computation. We can directly apply proximal algorithms (e.g. proximal gradient descent, proximal Newton, and proximal quasi-Newton algorithms) without worrying about the nonsmoothness of the loss function. Theoretically, we prove that the proximal algorithms combined with the pathwise optimization scheme enjoy fast convergence guarantees with high probability. Numerical results are provided to support our theory.
APA
Li, X., Jiang, H., Haupt, J., Arora, R., Liu, H., Hong, M. & Zhao, T. (2020). On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don’t Worry About its Nonsmooth Loss Function. Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, in Proceedings of Machine Learning Research 115:49-59. Available from https://proceedings.mlr.press/v115/li20a.html.
