p-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets

Alexander Munteanu, Simon Omlor, Christian Peters
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:2073-2100, 2022.

Abstract

We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses. It extends the standard probit model by replacing its link function, the standard normal cdf, by a $p$-generalized normal distribution for $p\in[1, \infty)$. The $p$-generalized normal distributions (Subbotin, 1923) are of special interest in statistical modeling because they fit much more flexibly to data. Their tail behavior can be controlled by choice of the parameter $p$, which influences the model’s sensitivity to outliers. Special cases include the Laplace, the Gaussian, and the uniform distributions. We further show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data by combining sketching techniques with importance subsampling to obtain a small data summary called coreset.
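To make the model in the abstract concrete, here is a minimal sketch of the link function and the likelihood being maximized. It rests on assumptions not stated in the abstract: the unit-scale density $f_p(x) = \frac{p^{1-1/p}}{2\Gamma(1/p)} \exp(-|x|^p/p)$ for the $p$-generalized normal distribution, labels coded as $\pm 1$, and a plain Simpson's rule for the cdf. The function names `pgn_pdf`, `pgn_cdf`, and `neg_log_likelihood` are hypothetical, and the sketch does not cover the sketching or coreset construction.

```python
import math


def pgn_pdf(x, p):
    """Density of the p-generalized normal distribution (assumed unit-scale form)."""
    norm = p ** (1.0 - 1.0 / p) / (2.0 * math.gamma(1.0 / p))
    return norm * math.exp(-abs(x) ** p / p)


def pgn_cdf(x, p, steps=2000):
    """CDF via composite Simpson's rule on [0, |x|], using symmetry about 0.

    `steps` must be even. This is an illustrative numerical approximation,
    not a production-quality implementation.
    """
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = pgn_pdf(0.0, p) + pgn_pdf(a, p)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 on odd, 2 on even nodes
        total += weight * pgn_pdf(i * h, p)
    half = total * h / 3.0  # probability mass between 0 and |x|
    return 0.5 + half if x > 0 else 0.5 - half


def neg_log_likelihood(beta, X, y, p):
    """Negative log-likelihood, assuming labels y_i in {-1, +1}.

    By symmetry of the density, P(y_i | x_i) = Phi_p(y_i * <x_i, beta>).
    """
    nll = 0.0
    for xi, yi in zip(X, y):
        z = yi * sum(b * v for b, v in zip(beta, xi))
        nll -= math.log(pgn_cdf(z, p))
    return nll
```

Under the assumed density, `pgn_cdf(x, 2)` agrees with the standard normal cdf and `pgn_cdf(x, 1)` with the unit-scale Laplace cdf, matching two of the special cases named in the abstract. For strongly negative margins the cdf underflows toward zero, so a practical implementation would evaluate the log-cdf more carefully.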

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-munteanu22a,
  title = {p-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets},
  author = {Munteanu, Alexander and Omlor, Simon and Peters, Christian},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages = {2073--2100},
  year = {2022},
  editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume = {151},
  series = {Proceedings of Machine Learning Research},
  month = {28--30 Mar},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v151/munteanu22a/munteanu22a.pdf},
  url = {https://proceedings.mlr.press/v151/munteanu22a.html},
  abstract = {We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses. It extends the standard probit model by replacing its link function, the standard normal cdf, by a $p$-generalized normal distribution for $p\in[1, \infty)$. The $p$-generalized normal distributions (Subbotin, 1923) are of special interest in statistical modeling because they fit much more flexibly to data. Their tail behavior can be controlled by choice of the parameter $p$, which influences the model’s sensitivity to outliers. Special cases include the Laplace, the Gaussian, and the uniform distributions. We further show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data by combining sketching techniques with importance subsampling to obtain a small data summary called coreset.}
}
Endnote
%0 Conference Paper
%T p-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets
%A Alexander Munteanu
%A Simon Omlor
%A Christian Peters
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-munteanu22a
%I PMLR
%P 2073--2100
%U https://proceedings.mlr.press/v151/munteanu22a.html
%V 151
%X We study the $p$-generalized probit regression model, which is a generalized linear model for binary responses. It extends the standard probit model by replacing its link function, the standard normal cdf, by a $p$-generalized normal distribution for $p\in[1, \infty)$. The $p$-generalized normal distributions (Subbotin, 1923) are of special interest in statistical modeling because they fit much more flexibly to data. Their tail behavior can be controlled by choice of the parameter $p$, which influences the model’s sensitivity to outliers. Special cases include the Laplace, the Gaussian, and the uniform distributions. We further show how the maximum likelihood estimator for $p$-generalized probit regression can be approximated efficiently up to a factor of $(1+\varepsilon)$ on large data by combining sketching techniques with importance subsampling to obtain a small data summary called coreset.
APA
Munteanu, A., Omlor, S. & Peters, C. (2022). p-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:2073-2100. Available from https://proceedings.mlr.press/v151/munteanu22a.html.
