Low-precision arithmetic for fast Gaussian processes

Wesley J. Maddox, Andres Potapczynski, Andrew Gordon Wilson
Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, PMLR 180:1306-1316, 2022.

Abstract

Low-precision arithmetic has had a transformative effect on the training of neural networks, reducing computation, memory, and energy requirements. However, despite their promise, low-precision operations have received little attention for Gaussian process (GP) training, largely because GPs require sophisticated linear algebra routines that are unstable in low precision. We study the different failure modes that can occur when training GPs in half precision. To circumvent these failure modes, we propose a multi-faceted approach involving conjugate gradients with re-orthogonalization, mixed precision, compact kernels, and preconditioners. Our approach significantly improves the numerical stability and practical performance of conjugate gradients in low precision over a wide range of settings, and enables training on 1.8 million data points in 10 hours on a single GPU, without requiring any sparse approximations.
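To make the approach concrete, below is a minimal PyTorch sketch of conjugate gradients with residual re-orthogonalization in mixed precision: the expensive kernel matrix-vector product runs in float16, while the scalar recurrences and stored residuals accumulate in float32. This is an illustration under our own assumptions, not the authors' released implementation; the names `cg_reorth`, `matmul_fp16`, `max_iters`, and `tol` are hypothetical.

```python
# A minimal sketch (assumed details, not the authors' code) of conjugate
# gradients with re-orthogonalization in mixed precision.
import torch

def cg_reorth(matmul_fp16, b, max_iters=100, tol=1e-4):
    """Approximately solve K x = b, where matmul_fp16(v) applies the kernel
    matrix K to v in half precision. All names here are illustrative."""
    x = torch.zeros_like(b, dtype=torch.float32)   # solution accumulates in fp32
    r = b.to(torch.float32).clone()                # residual kept in fp32
    p = r.clone()                                  # search direction
    basis = []                                     # normalized past residuals
    rs_old = r.dot(r)
    for _ in range(max_iters):
        # The O(n^2) matvec is the only half-precision step; promote back to fp32.
        Kp = matmul_fp16(p.to(torch.float16)).to(torch.float32)
        alpha = rs_old / p.dot(Kp)
        x = x + alpha * p
        r = r - alpha * Kp
        # Re-orthogonalize the new residual against all previous residuals
        # (Gram-Schmidt) to counter the round-off amplified by low precision.
        for q in basis:
            r = r - r.dot(q) * q
        rs_new = r.dot(r)
        if rs_new.sqrt() < tol:
            break
        basis.append(r / rs_new.sqrt())
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x
```

For a kernel matrix precomputed in half precision, `matmul_fp16` could be as simple as `lambda v: K_half @ v`. The paper additionally pairs such a solver with preconditioning and compact kernels, both of which this sketch omits.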

Cite this Paper

BibTeX
@InProceedings{pmlr-v180-maddox22a,
  title     = {Low-precision arithmetic for fast Gaussian processes},
  author    = {Maddox, Wesley J. and Potapczynski, Andres and Wilson, Andrew Gordon},
  booktitle = {Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence},
  pages     = {1306--1316},
  year      = {2022},
  editor    = {Cussens, James and Zhang, Kun},
  volume    = {180},
  series    = {Proceedings of Machine Learning Research},
  month     = {01--05 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v180/maddox22a/maddox22a.pdf},
  url       = {https://proceedings.mlr.press/v180/maddox22a.html},
  abstract  = {Low-precision arithmetic has had a transformative effect on the training of neural networks, reducing computation, memory, and energy requirements. However, despite their promise, low-precision operations have received little attention for Gaussian process (GP) training, largely because GPs require sophisticated linear algebra routines that are unstable in low precision. We study the different failure modes that can occur when training GPs in half precision. To circumvent these failure modes, we propose a multi-faceted approach involving conjugate gradients with re-orthogonalization, mixed precision, compact kernels, and preconditioners. Our approach significantly improves the numerical stability and practical performance of conjugate gradients in low precision over a wide range of settings, and enables training on 1.8 million data points in 10 hours on a single GPU, without requiring any sparse approximations.}
}
Endnote
%0 Conference Paper
%T Low-precision arithmetic for fast Gaussian processes
%A Wesley J. Maddox
%A Andres Potapczynski
%A Andrew Gordon Wilson
%B Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2022
%E James Cussens
%E Kun Zhang
%F pmlr-v180-maddox22a
%I PMLR
%P 1306--1316
%U https://proceedings.mlr.press/v180/maddox22a.html
%V 180
%X Low-precision arithmetic has had a transformative effect on the training of neural networks, reducing computation, memory, and energy requirements. However, despite their promise, low-precision operations have received little attention for Gaussian process (GP) training, largely because GPs require sophisticated linear algebra routines that are unstable in low precision. We study the different failure modes that can occur when training GPs in half precision. To circumvent these failure modes, we propose a multi-faceted approach involving conjugate gradients with re-orthogonalization, mixed precision, compact kernels, and preconditioners. Our approach significantly improves the numerical stability and practical performance of conjugate gradients in low precision over a wide range of settings, and enables training on 1.8 million data points in 10 hours on a single GPU, without requiring any sparse approximations.
APA
Maddox, W.J., Potapczynski, A. & Wilson, A.G. (2022). Low-precision arithmetic for fast Gaussian processes. Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 180:1306-1316. Available from https://proceedings.mlr.press/v180/maddox22a.html.
