Computationally Efficient Robust Sparse Estimation in High Dimensions

Sivaraman Balakrishnan, Simon S. Du, Jerry Li, Aarti Singh
Proceedings of the 2017 Conference on Learning Theory, PMLR 65:169-212, 2017.

Abstract

Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algorithm in the high-dimensional setting. Our theory identifies a unified set of deterministic conditions under which our algorithm guarantees accurate recovery. By further establishing that these deterministic conditions hold with high-probability for a wide range of statistical models, our theory applies to many problems of considerable interest including sparse mean and covariance estimation; sparse linear regression; and sparse generalized linear models. In certain settings, such as the detection and estimation of sparse principal components in the spiked covariance model, our general theory does not yield optimal sample complexity, and we provide a novel algorithm based on the same intuition which is able to take advantage of further structure of the problem to achieve nearly optimal rates.

Cite this Paper


BibTeX
@InProceedings{pmlr-v65-balakrishnan17a,
  title     = {Computationally Efficient Robust Sparse Estimation in High Dimensions},
  author    = {Balakrishnan, Sivaraman and Du, Simon S. and Li, Jerry and Singh, Aarti},
  booktitle = {Proceedings of the 2017 Conference on Learning Theory},
  pages     = {169--212},
  year      = {2017},
  editor    = {Kale, Satyen and Shamir, Ohad},
  volume    = {65},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--10 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v65/balakrishnan17a/balakrishnan17a.pdf},
  url       = {https://proceedings.mlr.press/v65/balakrishnan17a.html},
  abstract  = {Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algorithm in the high-dimensional setting. Our theory identifies a unified set of deterministic conditions under which our algorithm guarantees accurate recovery. By further establishing that these deterministic conditions hold with high-probability for a wide range of statistical models, our theory applies to many problems of considerable interest including sparse mean and covariance estimation; sparse linear regression; and sparse generalized linear models. In certain settings, such as the detection and estimation of sparse principal components in the spiked covariance model, our general theory does not yield optimal sample complexity, and we provide a novel algorithm based on the same intuition which is able to take advantage of further structure of the problem to achieve nearly optimal rates.}
}
Endnote
%0 Conference Paper
%T Computationally Efficient Robust Sparse Estimation in High Dimensions
%A Sivaraman Balakrishnan
%A Simon S. Du
%A Jerry Li
%A Aarti Singh
%B Proceedings of the 2017 Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2017
%E Satyen Kale
%E Ohad Shamir
%F pmlr-v65-balakrishnan17a
%I PMLR
%P 169--212
%U https://proceedings.mlr.press/v65/balakrishnan17a.html
%V 65
%X Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algorithm in the high-dimensional setting. Our theory identifies a unified set of deterministic conditions under which our algorithm guarantees accurate recovery. By further establishing that these deterministic conditions hold with high-probability for a wide range of statistical models, our theory applies to many problems of considerable interest including sparse mean and covariance estimation; sparse linear regression; and sparse generalized linear models. In certain settings, such as the detection and estimation of sparse principal components in the spiked covariance model, our general theory does not yield optimal sample complexity, and we provide a novel algorithm based on the same intuition which is able to take advantage of further structure of the problem to achieve nearly optimal rates.
APA
Balakrishnan, S., Du, S.S., Li, J. & Singh, A. (2017). Computationally Efficient Robust Sparse Estimation in High Dimensions. Proceedings of the 2017 Conference on Learning Theory, in Proceedings of Machine Learning Research 65:169-212. Available from https://proceedings.mlr.press/v65/balakrishnan17a.html.