Scalable Spike-and-Slab

Niloy Biswas; Lester Mackey; Xiao-Li Meng

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2021-2040, 2022.

Abstract

Spike-and-slab priors are commonly used for Bayesian variable selection, due to their interpretability and favorable statistical properties. However, existing samplers for spike-and-slab posteriors incur prohibitive computational costs when the number of variables is large. In this article, we propose Scalable Spike-and-Slab (S^3), a scalable Gibbs sampling implementation for high-dimensional Bayesian regression with the continuous spike-and-slab prior of George & McCulloch (1993). For a dataset with n observations and p covariates, S^3 has order max{n^2 p_t, np} computational cost at iteration t where p_t never exceeds the number of covariates switching spike-and-slab states between iterations t and t-1 of the Markov chain. This improves upon the order n^2 p per-iteration cost of state-of-the-art implementations as, typically, p_t is substantially smaller than p. We apply S^3 on synthetic and real-world datasets, demonstrating orders of magnitude speed-ups over existing exact samplers and significant gains in inferential quality over approximate samplers with comparable cost.
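The cost comparison in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: it evaluates the two stated orders, max{n^2 p_t, np} for S^3 and n^2 p for existing implementations, and takes their ratio. The function names and the values of n, p, and p_t are hypothetical choices for this example, not taken from the paper, and constant factors are ignored, so the ratio is indicative rather than a measured speed-up.

```python
def s3_cost_order(n: int, p: int, p_t: int) -> int:
    """Order of the S^3 per-iteration cost stated in the abstract: max{n^2 p_t, n p}."""
    return max(n * n * p_t, n * p)


def baseline_cost_order(n: int, p: int) -> int:
    """Order of the per-iteration cost of existing implementations: n^2 p."""
    return n * n * p


def cost_ratio(n: int, p: int, p_t: int) -> float:
    """Ratio of the two orders; constants are ignored, so this is only indicative."""
    return baseline_cost_order(n, p) / s3_cost_order(n, p, p_t)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen for illustration only.
    n, p = 100, 10_000
    for p_t in (1, 10, 100, 10_000):
        print(f"p_t={p_t}: ratio of cost orders = {cost_ratio(n, p, p_t)}")
```

Under these assumptions the ratio simplifies to min{n, p / p_t}: it is capped at n when few covariates switch states and falls to 1 when p_t is as large as p.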

Cite this Paper

BibTeX

@InProceedings{pmlr-v162-biswas22a,
  title = 	 {Scalable Spike-and-Slab},
  author =       {Biswas, Niloy and Mackey, Lester and Meng, Xiao-Li},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {2021--2040},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/biswas22a/biswas22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/biswas22a.html},
  abstract = 	 {Spike-and-slab priors are commonly used for Bayesian variable selection, due to their interpretability and favorable statistical properties. However, existing samplers for spike-and-slab posteriors incur prohibitive computational costs when the number of variables is large. In this article, we propose Scalable Spike-and-Slab (S^3), a scalable Gibbs sampling implementation for high-dimensional Bayesian regression with the continuous spike-and-slab prior of George \& McCulloch (1993). For a dataset with n observations and p covariates, S^3 has order max{n^2 p_t, np} computational cost at iteration t where p_t never exceeds the number of covariates switching spike-and-slab states between iterations t and t-1 of the Markov chain. This improves upon the order n^2 p per-iteration cost of state-of-the-art implementations as, typically, p_t is substantially smaller than p. We apply S^3 on synthetic and real-world datasets, demonstrating orders of magnitude speed-ups over existing exact samplers and significant gains in inferential quality over approximate samplers with comparable cost.}
}

Endnote

%0 Conference Paper
%T Scalable Spike-and-Slab
%A Niloy Biswas
%A Lester Mackey
%A Xiao-Li Meng
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-biswas22a
%I PMLR
%P 2021--2040
%U https://proceedings.mlr.press/v162/biswas22a.html
%V 162
%X Spike-and-slab priors are commonly used for Bayesian variable selection, due to their interpretability and favorable statistical properties. However, existing samplers for spike-and-slab posteriors incur prohibitive computational costs when the number of variables is large. In this article, we propose Scalable Spike-and-Slab (S^3), a scalable Gibbs sampling implementation for high-dimensional Bayesian regression with the continuous spike-and-slab prior of George & McCulloch (1993). For a dataset with n observations and p covariates, S^3 has order max{n^2 p_t, np} computational cost at iteration t where p_t never exceeds the number of covariates switching spike-and-slab states between iterations t and t-1 of the Markov chain. This improves upon the order n^2 p per-iteration cost of state-of-the-art implementations as, typically, p_t is substantially smaller than p. We apply S^3 on synthetic and real-world datasets, demonstrating orders of magnitude speed-ups over existing exact samplers and significant gains in inferential quality over approximate samplers with comparable cost.

APA

Biswas, N., Mackey, L. & Meng, X.-L. (2022). Scalable Spike-and-Slab. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:2021-2040. Available from https://proceedings.mlr.press/v162/biswas22a.html.
