Bayesian Masking: Sparse Bayesian Estimation with Weaker Shrinkage Bias

Yohei Kondo; Shin-ichi Maeda; Kohei Hayashi

Asian Conference on Machine Learning, PMLR 45:49-64, 2016.

Abstract

A common strategy for sparse linear regression is to introduce regularization, which eliminates irrelevant features by setting the corresponding weights to zero. However, regularization often also shrinks the estimates for relevant features, which leads to incorrect feature selection. Motivated by this issue, we propose Bayesian masking (BM), a sparse estimation method that imposes no regularization on the weights. The key concept of BM is to introduce binary latent variables that randomly mask features. Estimating the masking rates determines the relevance of the features automatically. We derive a variational Bayesian inference algorithm that maximizes a lower bound of the factorized information criterion (FIC), a recently developed asymptotic criterion for evaluating the marginal log-likelihood. In addition, we propose a reparametrization to accelerate the convergence of the derived algorithm. Finally, we show that BM outperforms Lasso and automatic relevance determination (ARD) in terms of the sparsity-shrinkage trade-off.
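
The masking idea in the abstract can be illustrated with a small forward simulation. The sketch below is not the authors' algorithm: it only draws data from a linear model whose features are randomly masked by binary variables, and it does not implement the variational FIC inference. The function name `sample_masked_regression`, the per-sample Bernoulli masks, the `keep_rates` parametrization (probability that a feature is kept rather than masked), and the Gaussian noise are all assumptions made for illustration.

```python
import random


def sample_masked_regression(X, weights, keep_rates, noise_std=0.1, seed=0):
    """Draw responses from a masked linear model (illustrative assumption).

    For every sample, feature j is kept (mask z_j = 1) with probability
    keep_rates[j] and masked out (z_j = 0) otherwise. The weights themselves
    are never shrunk; only the binary masks switch features on and off.
    """
    rng = random.Random(seed)
    ys, masks = [], []
    for x in X:
        # One binary mask per feature for this sample.
        z = [1 if rng.random() < rate else 0 for rate in keep_rates]
        # Linear predictor computed over the unmasked features only.
        mean = sum(zj * wj * xj for zj, wj, xj in zip(z, weights, x))
        ys.append(mean + rng.gauss(0.0, noise_std))
        masks.append(z)
    return ys, masks


# Toy data: three features, the third one is always masked (irrelevant).
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
weights = [2.0, -1.0, 0.5]
keep_rates = [1.0, 1.0, 0.0]
ys, masks = sample_masked_regression(X, weights, keep_rates)
```

In this toy setting a keep rate of 0 removes a feature entirely even though its weight is nonzero, which is the sense in which sparsity can come from the masks rather than from shrinking the weights.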

Cite this Paper

BibTeX


@InProceedings{pmlr-v45-Kondo15,
  title = {Bayesian Masking: Sparse Bayesian Estimation with Weaker Shrinkage Bias},
  author = {Kondo, Yohei and Maeda, Shin-ichi and Hayashi, Kohei},
  booktitle = {Asian Conference on Machine Learning},
  pages = {49--64},
  year = {2016},
  editor = {Holmes, Geoffrey and Liu, Tie-Yan},
  volume = {45},
  series = {Proceedings of Machine Learning Research},
  address = {Hong Kong},
  month = {20--22 Nov},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v45/Kondo15.pdf},
  url = {https://proceedings.mlr.press/v45/Kondo15.html},
  abstract = {A common strategy for sparse linear regression is to introduce regularization, which eliminates irrelevant features by letting the corresponding weights be zeros. However, regularization often shrinks the estimator for relevant features, which leads to incorrect feature selection. Motivated by the above-mentioned issue, we propose Bayesian masking (BM), a sparse estimation method which imposes no regularization on the weights. The key concept of BM is to introduce binary latent variables that randomly mask features. Estimating the masking rates determines the relevance of the features automatically. We derive a variational Bayesian inference algorithm that maximizes the lower bound of the factorized information criterion (FIC), which is a recently developed asymptotic criterion for evaluating the marginal log-likelihood. In addition, we propose reparametrization to accelerate the convergence of the derived algorithm. Finally, we show that BM outperforms Lasso and automatic relevance determination (ARD) in terms of the sparsity-shrinkage trade-off.}
}

Endnote

%0 Conference Paper
%T Bayesian Masking: Sparse Bayesian Estimation with Weaker Shrinkage Bias
%A Yohei Kondo
%A Shin-ichi Maeda
%A Kohei Hayashi
%B Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Geoffrey Holmes
%E Tie-Yan Liu
%F pmlr-v45-Kondo15
%I PMLR
%P 49--64
%U https://proceedings.mlr.press/v45/Kondo15.html
%V 45
%X A common strategy for sparse linear regression is to introduce regularization, which eliminates irrelevant features by letting the corresponding weights be zeros. However, regularization often shrinks the estimator for relevant features, which leads to incorrect feature selection. Motivated by the above-mentioned issue, we propose Bayesian masking (BM), a sparse estimation method which imposes no regularization on the weights.  The key concept of BM is to introduce binary latent variables that randomly mask features. Estimating the masking rates determines the relevance of the features automatically. We derive a variational Bayesian inference algorithm that maximizes the lower bound of the factorized information criterion (FIC), which is a recently developed asymptotic criterion for evaluating the marginal log-likelihood. In addition, we propose reparametrization to accelerate the convergence of the derived algorithm. Finally, we show that BM outperforms Lasso and automatic relevance determination (ARD) in terms of the sparsity-shrinkage trade-off.

RIS


TY  - CPAPER
TI  - Bayesian Masking: Sparse Bayesian Estimation with Weaker Shrinkage Bias
AU  - Yohei Kondo
AU  - Shin-ichi Maeda
AU  - Kohei Hayashi
BT  - Asian Conference on Machine Learning
DA  - 2016/02/25
ED  - Geoffrey Holmes
ED  - Tie-Yan Liu
ID  - pmlr-v45-Kondo15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 45
SP  - 49
EP  - 64
L1  - http://proceedings.mlr.press/v45/Kondo15.pdf
UR  - https://proceedings.mlr.press/v45/Kondo15.html
AB  - A common strategy for sparse linear regression is to introduce regularization, which eliminates irrelevant features by letting the corresponding weights be zeros. However, regularization often shrinks the estimator for relevant features, which leads to incorrect feature selection. Motivated by the above-mentioned issue, we propose Bayesian masking (BM), a sparse estimation method which imposes no regularization on the weights.  The key concept of BM is to introduce binary latent variables that randomly mask features. Estimating the masking rates determines the relevance of the features automatically. We derive a variational Bayesian inference algorithm that maximizes the lower bound of the factorized information criterion (FIC), which is a recently developed asymptotic criterion for evaluating the marginal log-likelihood. In addition, we propose reparametrization to accelerate the convergence of the derived algorithm. Finally, we show that BM outperforms Lasso and automatic relevance determination (ARD) in terms of the sparsity-shrinkage trade-off. 
ER  -

APA


Kondo, Y., Maeda, S. & Hayashi, K. (2016). Bayesian Masking: Sparse Bayesian Estimation with Weaker Shrinkage Bias. Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 45:49-64. Available from https://proceedings.mlr.press/v45/Kondo15.html.
