Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers

Jyotikrishna Dass; Rabi Mahapatra

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2467-2477, 2021.

Abstract

Least-Mean-Squares (\textsc{LMS}) solvers address a class of fundamental optimization problems such as linear regression, and regularized regressions such as Ridge, LASSO, and Elastic-Net. Data summarization techniques for big data generate summaries called coresets and sketches to speed up model learning under streaming and distributed settings. For example, \citep{nips2019} design a fast and accurate Caratheodory set on input data to boost the performance of existing \textsc{LMS} solvers. In retrospect, we explore the classical Householder transformation as a candidate for sketching and accurately solving \textsc{LMS} problems. We find it to be a simpler, more memory-efficient, and faster alternative to the above strong baseline, and one that has always existed. We also present a scalable algorithm based on the construction of distributed Householder sketches to solve the \textsc{LMS} problem across multiple worker nodes. We perform a thorough empirical analysis with large synthetic and real datasets to evaluate the performance of the Householder sketch and compare it with \citep{nips2019}. Our results show that the Householder sketch speeds up existing \textsc{LMS} solvers in the scikit-learn library by up to $100$x-$400$x. It is also $10$x-$100$x faster than the above baseline, with similar numerical stability. The distributed algorithm demonstrates linear scalability with near-negligible communication overhead.
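
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes the sketch is the triangular factor R from a Householder QR factorization of the augmented matrix [A | b] (NumPy's `np.linalg.qr` calls LAPACK's Householder-based routines). The function names `householder_sketch` and `merge_sketches`, the synthetic data, and the stacked-QR merge used for the distributed case are all illustrative assumptions and may differ from the paper's algorithm.

```python
import numpy as np

def householder_sketch(A, b):
    """Return a (d+1)-row summary (A_s, b_s) of the n-row problem (A, b).

    Hypothetical helper: QR of [A | b] gives an orthogonal Q and a
    triangular R, so ||A x - b|| == ||A_s x - b_s|| for every x.
    """
    M = np.column_stack([A, b])        # n x (d+1) augmented matrix
    R = np.linalg.qr(M, mode="r")      # (d+1) x (d+1), Householder-based
    d = A.shape[1]
    return R[:, :d], R[:, d]

def merge_sketches(R_parts):
    """Assumed merge step: stack per-worker R factors and factor again."""
    return np.linalg.qr(np.vstack(R_parts), mode="r")

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n, d = 10_000, 5
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Single-node sketch: solve on d+1 rows instead of n rows.
A_s, b_s = householder_sketch(A, b)
x_full = np.linalg.lstsq(A, b, rcond=None)[0]
x_sketch = np.linalg.lstsq(A_s, b_s, rcond=None)[0]

# Simulated distributed setting: 4 "workers", each sketching its own rows.
M = np.column_stack([A, b])
R_parts = [np.linalg.qr(part, mode="r") for part in np.array_split(M, 4)]
R = merge_sketches(R_parts)
x_dist = np.linalg.lstsq(R[:, :d], R[:, d], rcond=None)[0]
```

Because the residual norm is preserved for every x, the same small pair `(A_s, b_s)` could in principle be passed to a regularized solver in place of `(A, b)`, provided the solver's objective is not rescaled by the number of rows; that caveat is an observation about the sketch above, not a claim from the paper.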

Cite this Paper

BibTeX


@InProceedings{pmlr-v139-dass21a,
  title = 	 {Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers},
  author =       {Dass, Jyotikrishna and Mahapatra, Rabi},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {2467--2477},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/dass21a/dass21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/dass21a.html},
  abstract = 	 {Least-Mean-Squares (\textsc{LMS}) solvers comprise a class of fundamental optimization problems such as linear regression, and regularized regressions such as Ridge, LASSO, and Elastic-Net. Data summarization techniques for big data generate summaries called coresets and sketches to speed up model learning under streaming and distributed settings. For example, \citep{nips2019} design a fast and accurate Caratheodory set on input data to boost the performance of existing \textsc{LMS} solvers. In retrospect, we explore classical Householder transformation as a candidate for sketching and accurately solving LMS problems. We find it to be a simpler, memory-efficient, and faster alternative that always existed to the above strong baseline. We also present a scalable algorithm based on the construction of distributed Householder sketches to solve \textsc{LMS} problem across multiple worker nodes. We perform thorough empirical analysis with large synthetic and real datasets to evaluate the performance of Householder sketch and compare with \citep{nips2019}. Our results show Householder sketch speeds up existing \textsc{LMS} solvers in the scikit-learn library up to $100$x-$400$x. Also, it is $10$x-$100$x faster than the above baseline with similar numerical stability. The distributed algorithm demonstrates linear scalability with a near-negligible communication overhead.}
}

Endnote

%0 Conference Paper
%T Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers
%A Jyotikrishna Dass
%A Rabi Mahapatra
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-dass21a
%I PMLR
%P 2467--2477
%U https://proceedings.mlr.press/v139/dass21a.html
%V 139
%X Least-Mean-Squares (\textsc{LMS}) solvers comprise a class of fundamental optimization problems such as linear regression, and regularized regressions such as Ridge, LASSO, and Elastic-Net. Data summarization techniques for big data generate summaries called coresets and sketches to speed up model learning under streaming and distributed settings. For example, \citep{nips2019} design a fast and accurate Caratheodory set on input data to boost the performance of existing \textsc{LMS} solvers. In retrospect, we explore classical Householder transformation as a candidate for sketching and accurately solving LMS problems. We find it to be a simpler, memory-efficient, and faster alternative that always existed to the above strong baseline. We also present a scalable algorithm based on the construction of distributed Householder sketches to solve \textsc{LMS} problem across multiple worker nodes. We perform thorough empirical analysis with large synthetic and real datasets to evaluate the performance of Householder sketch and compare with \citep{nips2019}. Our results show Householder sketch speeds up existing \textsc{LMS} solvers in the scikit-learn library up to $100$x-$400$x. Also, it is $10$x-$100$x faster than the above baseline with similar numerical stability. The distributed algorithm demonstrates linear scalability with a near-negligible communication overhead.

APA


Dass, J. & Mahapatra, R. (2021). Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2467-2477. Available from https://proceedings.mlr.press/v139/dass21a.html.
