Statistical analysis of stochastic gradient methods for generalized linear models

Panagiotis Toulis, Edoardo Airoldi, Jason Rennie
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):667-675, 2014.

Abstract

We study the statistical properties of stochastic gradient descent (SGD) using explicit and implicit updates for fitting generalized linear models (GLMs). Initially, we develop a computationally efficient algorithm to implement implicit SGD learning of GLMs. Next, we obtain exact formulas for the bias and variance of both updates, which lead to two important observations on their comparative statistical properties. First, in small samples, the estimates from the implicit procedure are more biased than the estimates from the explicit one, but their empirical variance is smaller and they are more robust to learning rate misspecification. Second, the two procedures are statistically identical in the limit: they are both unbiased, converge at the same rate and have the same asymptotic variance. Our experiments confirm our theory and, more broadly, suggest that the implicit procedure can be a competitive choice for fitting large-scale models, especially when robustness is a concern.
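As a rough sketch of what the two updates look like in practice, the Python code below contrasts an explicit SGD step with an implicit one for a logistic-regression GLM. The setup here is an assumption for illustration (logistic link, learning rate a_n = 1/n, a plain bisection solver, and the names explicit_sgd_step / implicit_sgd_step are not from the paper); it only illustrates the general idea that the implicit update can be reduced to a one-dimensional equation in the step size, since the stochastic gradient is proportional to the current feature vector.

import numpy as np

def explicit_sgd_step(theta, x, y, lr):
    """Explicit SGD update for a logistic GLM:
    theta_n = theta_{n-1} + lr * (y - h(x' theta_{n-1})) * x,
    where h is the logistic mean function."""
    h = 1.0 / (1.0 + np.exp(-(x @ theta)))
    return theta + lr * (y - h) * x

def implicit_sgd_step(theta, x, y, lr, tol=1e-10):
    """Implicit SGD update for a logistic GLM:
    theta_n = theta_{n-1} + lr * (y - h(x' theta_n)) * x.
    The new iterate appears on both sides, but because the gradient is
    proportional to x, theta_n = theta_{n-1} + xi * x, where the scalar xi
    solves  xi = lr * (y - h(x' theta_{n-1} + xi * ||x||^2)).
    That one-dimensional equation is solved here by plain bisection."""
    eta = x @ theta                      # current linear predictor
    norm2 = x @ x
    h = lambda t: 1.0 / (1.0 + np.exp(-t))
    g = lambda xi: lr * (y - h(eta + xi * norm2)) - xi
    # Because h is increasing, the root lies between 0 and the explicit
    # step size lr * (y - h(eta)), so this bracket always contains it.
    r0 = lr * (y - h(eta))
    lo, hi = (0.0, r0) if r0 >= 0.0 else (r0, 0.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return theta + 0.5 * (lo + hi) * x

# Usage sketch: one pass over a simulated stream with learning rate a_n = 1/n.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -0.5])
theta_exp = np.zeros(2)
theta_imp = np.zeros(2)
for n in range(1, 5001):
    x = rng.normal(size=2)
    y = float(rng.random() < 1.0 / (1.0 + np.exp(-(x @ theta_true))))
    lr = 1.0 / n
    theta_exp = explicit_sgd_step(theta_exp, x, y, lr)
    theta_imp = implicit_sgd_step(theta_imp, x, y, lr)

In this sketch the implicit step never overshoots: its step size is the fixed point of a shrinking map of the explicit step, which is one way to see the robustness to learning rate misspecification discussed in the abstract.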

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-toulis14,
  title     = {Statistical analysis of stochastic gradient methods for generalized linear models},
  author    = {Toulis, Panagiotis and Airoldi, Edoardo and Rennie, Jason},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages     = {667--675},
  year      = {2014},
  editor    = {Xing, Eric P. and Jebara, Tony},
  volume    = {32},
  number    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {Beijing, China},
  month     = {22--24 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v32/toulis14.pdf},
  url       = {https://proceedings.mlr.press/v32/toulis14.html}
}
APA
Toulis, P., Airoldi, E. & Rennie, J. (2014). Statistical analysis of stochastic gradient methods for generalized linear models. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):667-675. Available from https://proceedings.mlr.press/v32/toulis14.html.
