A Deterministic Streaming Sketch for Ridge Regression

Benwei Shi, Jeff Phillips
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:586-594, 2021.

Abstract

We provide a deterministic space-efficient algorithm for estimating ridge regression. For n data points with d features and a large enough regularization parameter, we provide a solution within eps L_2 error using only O(d/eps) space. This is the first o(d^2) space deterministic streaming algorithm with guaranteed solution error and risk bound for this classic problem. The algorithm sketches the covariance matrix by variants of Frequent Directions, which implies it can operate in insertion-only streams and a variety of distributed data settings. In comparisons to randomized sketching algorithms on synthetic and real-world datasets, our algorithm has less empirical error using less space and similar time.
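The idea in the abstract can be illustrated with a minimal sketch. It assumes the basic Frequent Directions update with a 2ℓ-row buffer, not necessarily the specific variants analysed in the paper. The names `fd_ridge_sketch`, `_shrink`, `ell`, and `gamma` are hypothetical and chosen for illustration. The code keeps a small sketch B of the data matrix A, accumulates A^T b exactly, and solves the ridge system with B^T B in place of A^T A, without ever forming a d x d matrix.

```python
import numpy as np


def _shrink(B, ell):
    """Frequent Directions shrink step: keep at most ell nonzero rows."""
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Subtract the (ell+1)-th squared singular value from every direction.
    delta = s[ell] ** 2 if len(s) > ell else 0.0
    s_shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
    k = min(ell, len(s))
    new_B = np.zeros_like(B)
    new_B[:k] = s_shrunk[:k, None] * Vt[:k]
    return new_B, k


def fd_ridge_sketch(stream, d, ell, gamma):
    """Approximate ridge solution from one pass over (row, target) pairs.

    Illustrative assumption: solves (B^T B + gamma * I) x = A^T b, where B is
    a Frequent Directions sketch of A with a buffer of 2 * ell rows.
    """
    B = np.zeros((2 * ell, d))  # sketch buffer, O(ell * d) space
    c = np.zeros(d)             # exact running sum A^T b, O(d) space
    filled = 0
    for a, y in stream:
        if filled == 2 * ell:   # buffer full: compress before inserting
            B, filled = _shrink(B, ell)
        B[filled] = a
        filled += 1
        c += y * np.asarray(a, dtype=float)

    # Solve using the thin SVD of B so only O(ell * d) memory is needed:
    # (V S^2 V^T + gamma I)^{-1} c
    #   = V diag(1 / (s^2 + gamma)) V^T c + (c - V V^T c) / gamma
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    proj = Vt @ c
    return Vt.T @ (proj / (s ** 2 + gamma)) + (c - Vt.T @ proj) / gamma
```

A call such as `fd_ridge_sketch(zip(A, b), d=A.shape[1], ell=10, gamma=5.0)` processes the rows of `A` one at a time. When `d <= ell` the shrink step subtracts nothing, so the result matches the exact ridge solution up to floating-point error.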

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-shi21b,
  title = {A Deterministic Streaming Sketch for Ridge Regression},
  author = {Shi, Benwei and Phillips, Jeff},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = {586--594},
  year = {2021},
  editor = {Banerjee, Arindam and Fukumizu, Kenji},
  volume = {130},
  series = {Proceedings of Machine Learning Research},
  month = {13--15 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v130/shi21b/shi21b.pdf},
  url = {http://proceedings.mlr.press/v130/shi21b.html},
  abstract = {We provide a deterministic space-efficient algorithm for estimating ridge regression. For n data points with d features and a large enough regularization parameter, we provide a solution within eps L_2 error using only O(d/eps) space. This is the first o(d^2) space deterministic streaming algorithm with guaranteed solution error and risk bound for this classic problem. The algorithm sketches the covariance matrix by variants of Frequent Directions, which implies it can operate in insertion-only streams and a variety of distributed data settings. In comparisons to randomized sketching algorithms on synthetic and real-world datasets, our algorithm has less empirical error using less space and similar time.}
}
Endnote
%0 Conference Paper
%T A Deterministic Streaming Sketch for Ridge Regression
%A Benwei Shi
%A Jeff Phillips
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-shi21b
%I PMLR
%P 586--594
%U http://proceedings.mlr.press/v130/shi21b.html
%V 130
%X We provide a deterministic space-efficient algorithm for estimating ridge regression. For n data points with d features and a large enough regularization parameter, we provide a solution within eps L_2 error using only O(d/eps) space. This is the first o(d^2) space deterministic streaming algorithm with guaranteed solution error and risk bound for this classic problem. The algorithm sketches the covariance matrix by variants of Frequent Directions, which implies it can operate in insertion-only streams and a variety of distributed data settings. In comparisons to randomized sketching algorithms on synthetic and real-world datasets, our algorithm has less empirical error using less space and similar time.
APA
Shi, B. & Phillips, J. (2021). A Deterministic Streaming Sketch for Ridge Regression. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:586-594. Available from http://proceedings.mlr.press/v130/shi21b.html.
