A Light Touch for Heavily Constrained SGD

Andrew Cotter; Maya Gupta; Jan Pfeifer

A Light Touch for Heavily Constrained SGD

Andrew Cotter, Maya Gupta, Jan Pfeifer

29th Annual Conference on Learning Theory, PMLR 49:729-771, 2016.

Abstract

Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, classifiers that guarantee a certain class label for some subset of examples, etc. However, these restrictions may result in a very large number of constraints. Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but requires a projection after each update. For heavily-constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while only applying constraints probabilistically at each iteration. Theoretical analysis shows a compelling trade-off between per-iteration work and the number of iterations needed on problems with a large number of constraints.

Cite this Paper

BibTeX


@InProceedings{pmlr-v49-cotter16,
  title = 	 {A Light Touch for Heavily Constrained SGD},
  author = 	 {Cotter, Andrew and Gupta, Maya and Pfeifer, Jan},
  booktitle = 	 {29th Annual Conference on Learning Theory},
  pages = 	 {729--771},
  year = 	 {2016},
  editor = 	 {Feldman, Vitaly and Rakhlin, Alexander and Shamir, Ohad},
  volume = 	 {49},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Columbia University, New York, New York, USA},
  month = 	 {23--26 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v49/cotter16.pdf},
  url = 	 {https://proceedings.mlr.press/v49/cotter16.html},
  abstract = 	 {Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, classifiers that guarantee a certain class label for some subset of examples, etc. However, these restrictions may result in a very large number of constraints. Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but requires a projection after each update. For heavily-constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while only applying constraints probabilistically at each iteration. Theoretical analysis shows a compelling trade-off between per-iteration work and the number of iterations needed on problems with a large number of constraints.}
}

Endnote

%0 Conference Paper
%T A Light Touch for Heavily Constrained SGD
%A Andrew Cotter
%A Maya Gupta
%A Jan Pfeifer
%B 29th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2016
%E Vitaly Feldman
%E Alexander Rakhlin
%E Ohad Shamir	
%F pmlr-v49-cotter16
%I PMLR
%P 729--771
%U https://proceedings.mlr.press/v49/cotter16.html
%V 49
%X Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, classifiers that guarantee a certain class label for some subset of examples, etc. However, these restrictions may result in a very large number of constraints. Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but requires a projection after each update. For heavily-constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while only applying constraints probabilistically at each iteration. Theoretical analysis shows a compelling trade-off between per-iteration work and the number of iterations needed on problems with a large number of constraints.

RIS


TY  - CPAPER
TI  - A Light Touch for Heavily Constrained SGD
AU  - Andrew Cotter
AU  - Maya Gupta
AU  - Jan Pfeifer
BT  - 29th Annual Conference on Learning Theory
DA  - 2016/06/06
ED  - Vitaly Feldman
ED  - Alexander Rakhlin
ED  - Ohad Shamir	
ID  - pmlr-v49-cotter16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 49
SP  - 729
EP  - 771
L1  - http://proceedings.mlr.press/v49/cotter16.pdf
UR  - https://proceedings.mlr.press/v49/cotter16.html
AB  - Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, classifiers that guarantee a certain class label for some subset of examples, etc. However, these restrictions may result in a very large number of constraints. Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but requires a projection after each update. For heavily-constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while only applying constraints probabilistically at each iteration. Theoretical analysis shows a compelling trade-off between per-iteration work and the number of iterations needed on problems with a large number of constraints.
ER  -

APA


Cotter, A., Gupta, M. & Pfeifer, J.. (2016). A Light Touch for Heavily Constrained SGD. 29th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 49:729-771 Available from https://proceedings.mlr.press/v49/cotter16.html.

Related Material

Download PDF