Quantile Regression for Large-scale Applications

Jiyan Yang; Xiangrui Meng; Michael Mahoney

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):881-887, 2013.

Abstract

Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed covariates than methods such as Least-squares or Least Absolute Deviations regression. It can be expressed as a linear program, and interior-point methods can be used to find a solution for moderately large problems. Dealing with very large problems, e.g., involving data up to and beyond the terabyte regime, remains a challenge. Here, we present a randomized algorithm that runs in time that is nearly linear in the size of the input and that, with constant probability, computes a (1+ε) approximate solution to an arbitrary quantile regression problem. Our algorithm computes a low-distortion subspace-preserving embedding with respect to the loss function of quantile regression. Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.
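As a small illustration of the loss function the abstract refers to, the sketch below evaluates the standard quantile ("check") loss and fits an intercept-only model by minimizing it. This is not the paper's randomized algorithm or its linear-programming formulation; the function names (`check_loss`, `total_loss`, `constant_quantile_fit`) and the toy data are assumptions made here for illustration only.

```python
def check_loss(residual, tau):
    """Standard quantile (check) loss for one residual r = y - prediction.

    Positive residuals are weighted by tau, negative ones by (1 - tau).
    """
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def total_loss(values, candidate, tau):
    """Sum of check losses when every response is predicted by `candidate`."""
    return sum(check_loss(y - candidate, tau) for y in values)


def constant_quantile_fit(values, tau):
    """Fit an intercept-only quantile model (hypothetical helper).

    The objective is convex and piecewise linear with kinks at the data
    points, so at least one data point is a minimizer. Searching over the
    data points is therefore enough for this toy case.
    """
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Toy data with one large outlier (illustrative values only).
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = constant_quantile_fit(data, 0.5)  # 3.0, the sample median
upper_fit = constant_quantile_fit(data, 0.9)   # 100.0, the 0.9-quantile
print(median_fit, upper_fit)
```

With covariates, the same loss is summed over residuals of a linear model, which is the problem the abstract says can be posed as a linear program. The exhaustive search above only works for the intercept-only case and is meant to show what the loss measures, not how to solve large problems.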

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-yang13f,
  title = 	 {Quantile Regression for Large-scale Applications},
  author = 	 {Yang, Jiyan and Meng, Xiangrui and Mahoney, Michael},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {881--887},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/yang13f.pdf},
  url = 	 {https://proceedings.mlr.press/v28/yang13f.html},
  abstract = 	 {Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed covariates than methods such as Least-squares or Least Absolute Deviations regression. It can be expressed as a linear program, and interior-point methods can be used to find a solution for moderately large problems. Dealing with very large problems, \emph{e.g.}, involving data up to and beyond the terabyte regime, remains a challenge. Here, we present a randomized algorithm that runs in time that is nearly linear in the size of the input and that, with constant probability, computes a (1+ε) approximate solution to an arbitrary quantile regression problem. Our algorithm computes a low-distortion subspace-preserving embedding with respect to the loss function of quantile regression. Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.}
}

Endnote

%0 Conference Paper
%T Quantile Regression for Large-scale Applications
%A Jiyan Yang
%A Xiangrui Meng
%A Michael Mahoney
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-yang13f
%I PMLR
%P 881--887
%U https://proceedings.mlr.press/v28/yang13f.html
%V 28
%N 3
%X Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed covariates than methods such as Least-squares or Least Absolute Deviations regression. It can be expressed as a linear program, and interior-point methods can be used to find a solution for moderately large problems. Dealing with very large problems, e.g., involving data up to and beyond the terabyte regime, remains a challenge. Here, we present a randomized algorithm that runs in time that is nearly linear in the size of the input and that, with constant probability, computes a (1+ε) approximate solution to an arbitrary quantile regression problem. Our algorithm computes a low-distortion subspace-preserving embedding with respect to the loss function of quantile regression. Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.

RIS


TY  - CPAPER
TI  - Quantile Regression for Large-scale Applications
AU  - Jiyan Yang
AU  - Xiangrui Meng
AU  - Michael Mahoney
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-yang13f
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 881
EP  - 887
L1  - http://proceedings.mlr.press/v28/yang13f.pdf
UR  - https://proceedings.mlr.press/v28/yang13f.html
AB  - Quantile regression is a method to estimate the quantiles of the conditional distribution of a response variable, and as such it permits a much more accurate portrayal of the relationship between the response variable and observed covariates than methods such as Least-squares or Least Absolute Deviations regression. It can be expressed as a linear program, and interior-point methods can be used to find a solution for moderately large problems. Dealing with very large problems, e.g., involving data up to and beyond the terabyte regime, remains a challenge. Here, we present a randomized algorithm that runs in time that is nearly linear in the size of the input and that, with constant probability, computes a (1+ε) approximate solution to an arbitrary quantile regression problem. Our algorithm computes a low-distortion subspace-preserving embedding with respect to the loss function of quantile regression. Our empirical evaluation illustrates that our algorithm is competitive with the best previous work on small to medium-sized problems, and that it can be implemented in MapReduce-like environments and applied to terabyte-sized problems.
ER  -

APA


Yang, J., Meng, X. & Mahoney, M. (2013). Quantile Regression for Large-scale Applications. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):881-887. Available from https://proceedings.mlr.press/v28/yang13f.html.
