Using More Data to Speed-up Training Time

Shai Shalev-Shwartz; Ohad Shamir; Eran Tromer

Using More Data to Speed-up Training Time

Shai Shalev-Shwartz, Ohad Shamir, Eran Tromer

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1019-1027, 2012.

Abstract

In many recent applications, data is plentiful. By now, we have a rather clear understanding of how more data can be used to improve the accuracy of learning algorithms. Recently, there has been a growing interest in understanding how more data can be leveraged to reduce the required training runtime. In this paper, we study the runtime of learning as a function of the number of available training examples, and underscore the main high-level techniques. We provide the first formal positive result showing that even in the unrealizable case, the runtime can decrease exponentially while only requiring a polynomial growth of the number of examples. Our construction corresponds to a synthetic learning problem and an interesting open question is whether the tradeoff can be shown for more natural learning problems. We spell out several interesting candidates of natural learning problems for which we conjecture that there is a tradeoff between computational and sample complexity.

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-shalev-shwartz12,
  title = 	 {Using More Data to Speed-up Training Time},
  author = 	 {Shalev-Shwartz, Shai and Shamir, Ohad and Tromer, Eran},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1019--1027},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/shalev-shwartz12/shalev-shwartz12.pdf},
  url = 	 {https://proceedings.mlr.press/v22/shalev-shwartz12.html},
  abstract = 	 {In many recent applications, data is plentiful. By now, we have a rather clear understanding of how more data can be used to improve the accuracy of learning algorithms. Recently, there has been a growing interest in understanding how more data can be leveraged to reduce the required training runtime. In this paper, we study the runtime of learning as a function of the number of available training examples, and underscore the main high-level techniques. We provide the first formal positive result showing that even in the unrealizable case, the runtime can decrease exponentially while only  requiring a polynomial growth of the number of examples. Our construction corresponds to a synthetic learning problem and an interesting open question is whether the tradeoff can be shown for more natural learning problems. We spell out several interesting candidates of natural learning problems for which we conjecture that there is a tradeoff between computational and sample complexity.}
}

Endnote

%0 Conference Paper
%T Using More Data to Speed-up Training Time
%A Shai Shalev-Shwartz
%A Ohad Shamir
%A Eran Tromer
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami	
%F pmlr-v22-shalev-shwartz12
%I PMLR
%P 1019--1027
%U https://proceedings.mlr.press/v22/shalev-shwartz12.html
%V 22
%X In many recent applications, data is plentiful. By now, we have a rather clear understanding of how more data can be used to improve the accuracy of learning algorithms. Recently, there has been a growing interest in understanding how more data can be leveraged to reduce the required training runtime. In this paper, we study the runtime of learning as a function of the number of available training examples, and underscore the main high-level techniques. We provide the first formal positive result showing that even in the unrealizable case, the runtime can decrease exponentially while only  requiring a polynomial growth of the number of examples. Our construction corresponds to a synthetic learning problem and an interesting open question is whether the tradeoff can be shown for more natural learning problems. We spell out several interesting candidates of natural learning problems for which we conjecture that there is a tradeoff between computational and sample complexity.

RIS


TY  - CPAPER
TI  - Using More Data to Speed-up Training Time
AU  - Shai Shalev-Shwartz
AU  - Ohad Shamir
AU  - Eran Tromer
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami	
ID  - pmlr-v22-shalev-shwartz12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 1019
EP  - 1027
L1  - http://proceedings.mlr.press/v22/shalev-shwartz12/shalev-shwartz12.pdf
UR  - https://proceedings.mlr.press/v22/shalev-shwartz12.html
AB  - In many recent applications, data is plentiful. By now, we have a rather clear understanding of how more data can be used to improve the accuracy of learning algorithms. Recently, there has been a growing interest in understanding how more data can be leveraged to reduce the required training runtime. In this paper, we study the runtime of learning as a function of the number of available training examples, and underscore the main high-level techniques. We provide the first formal positive result showing that even in the unrealizable case, the runtime can decrease exponentially while only  requiring a polynomial growth of the number of examples. Our construction corresponds to a synthetic learning problem and an interesting open question is whether the tradeoff can be shown for more natural learning problems. We spell out several interesting candidates of natural learning problems for which we conjecture that there is a tradeoff between computational and sample complexity.
ER  -

APA


Shalev-Shwartz, S., Shamir, O. & Tromer, E.. (2012). Using More Data to Speed-up Training Time. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1019-1027 Available from https://proceedings.mlr.press/v22/shalev-shwartz12.html.

Using More Data to Speed-up Training Time

Abstract

Cite this Paper

Related Material