Statistical analysis of stochastic gradient methods for generalized linear models

Panagiotis Toulis; Edoardo Airoldi; Jason Rennie

Statistical analysis of stochastic gradient methods for generalized linear models

Panagiotis Toulis, Edoardo Airoldi, Jason Rennie

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):667-675, 2014.

Abstract

We study the statistical properties of stochastic gradient descent (SGD) using explicit and implicit updates for fitting generalized linear models (GLMs). Initially, we develop a computationally efficient algorithm to implement implicit SGD learning of GLMs. Next, we obtain exact formulas for the bias and variance of both updates which leads to two important observations on their comparative statistical properties. First, in small samples, the estimates from the implicit procedure are more biased than the estimates from the explicit one, but their empirical variance is smaller and they are more robust to learning rate misspecification. Second, the two procedures are statistically identical in the limit: they are both unbiased, converge at the same rate and have the same asymptotic variance. Our set of experiments confirm our theory and more broadly suggest that the implicit procedure can be a competitive choice for fitting large-scale models, especially when robustness is a concern.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-toulis14,
  title = 	 {Statistical analysis of stochastic gradient methods for generalized linear models},
  author = 	 {Toulis, Panagiotis and Airoldi, Edoardo and Rennie, Jason},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {667--675},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Bejing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/toulis14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/toulis14.html},
  abstract = 	 {We study the statistical properties of stochastic gradient descent (SGD) using   explicit and implicit updates for fitting generalized linear models (GLMs).  Initially, we develop a computationally   efficient algorithm to implement implicit SGD learning of GLMs.  Next, we obtain exact formulas for the bias and variance  of both updates which leads to two important observations on their   comparative statistical properties.  First, in small samples, the estimates from the implicit procedure   are more biased than the estimates from the explicit one,   but their empirical variance is smaller and they are more robust to   learning rate misspecification.   Second, the two procedures are statistically identical in the limit:   they are both unbiased, converge at the same rate and have the   same asymptotic variance. Our set of experiments confirm our theory and   more broadly suggest that the implicit procedure can be a competitive choice   for fitting large-scale  models, especially when robustness is a concern.}
}

Endnote

%0 Conference Paper
%T Statistical analysis of stochastic gradient methods for generalized linear models
%A Panagiotis Toulis
%A Edoardo Airoldi
%A Jason Rennie
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara	
%F pmlr-v32-toulis14
%I PMLR
%P 667--675
%U https://proceedings.mlr.press/v32/toulis14.html
%V 32
%N 2
%X We study the statistical properties of stochastic gradient descent (SGD) using   explicit and implicit updates for fitting generalized linear models (GLMs).  Initially, we develop a computationally   efficient algorithm to implement implicit SGD learning of GLMs.  Next, we obtain exact formulas for the bias and variance  of both updates which leads to two important observations on their   comparative statistical properties.  First, in small samples, the estimates from the implicit procedure   are more biased than the estimates from the explicit one,   but their empirical variance is smaller and they are more robust to   learning rate misspecification.   Second, the two procedures are statistically identical in the limit:   they are both unbiased, converge at the same rate and have the   same asymptotic variance. Our set of experiments confirm our theory and   more broadly suggest that the implicit procedure can be a competitive choice   for fitting large-scale  models, especially when robustness is a concern.

RIS


TY  - CPAPER
TI  - Statistical analysis of stochastic gradient methods for generalized linear models
AU  - Panagiotis Toulis
AU  - Edoardo Airoldi
AU  - Jason Rennie
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara	
ID  - pmlr-v32-toulis14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 667
EP  - 675
L1  - http://proceedings.mlr.press/v32/toulis14.pdf
UR  - https://proceedings.mlr.press/v32/toulis14.html
AB  - We study the statistical properties of stochastic gradient descent (SGD) using   explicit and implicit updates for fitting generalized linear models (GLMs).  Initially, we develop a computationally   efficient algorithm to implement implicit SGD learning of GLMs.  Next, we obtain exact formulas for the bias and variance  of both updates which leads to two important observations on their   comparative statistical properties.  First, in small samples, the estimates from the implicit procedure   are more biased than the estimates from the explicit one,   but their empirical variance is smaller and they are more robust to   learning rate misspecification.   Second, the two procedures are statistically identical in the limit:   they are both unbiased, converge at the same rate and have the   same asymptotic variance. Our set of experiments confirm our theory and   more broadly suggest that the implicit procedure can be a competitive choice   for fitting large-scale  models, especially when robustness is a concern.
ER  -

APA


Toulis, P., Airoldi, E. & Rennie, J.. (2014). Statistical analysis of stochastic gradient methods for generalized linear models. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):667-675 Available from https://proceedings.mlr.press/v32/toulis14.html.

Related Material

Download PDF