Deploying high throughput predictive models with the actor framework

Brian Gawalt
Proceedings of The 2nd International Conference on Predictive APIs and Apps, PMLR 50:15-28, 2016.

Abstract

The majority of data science and machine learning tutorials focus on generating models: assembling a dataset; splitting the data into training, validation, and testing subsets; building the model; and demonstrating its generalizability. But when those same steps must be repeated to use the model in production, issues of high latency or low throughput can arise. To an end user, the cost of too much time spent featurizing raw data and evaluating a model over those features can wind up erasing any gains a smarter prediction can offer. Exposing concurrency in these model-usage steps, and then capitalizing on that concurrency, can improve throughput. This paper describes how the actor framework can be used to bring a predictive model to a real-time setting. Two case-study examples are described: a live deployment built for the freelancing platform Upwork, and a simple text classifier with accompanying code for use as an introductory project.
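The actor pattern the abstract describes can be illustrated with a minimal sketch: featurization and model evaluation become separate actors, each a single thread draining a queue-like mailbox, so the two model-usage steps run concurrently as a pipeline. This is a hypothetical Python illustration of the general pattern, not code from the paper (the actor classes, feature names, and the stub linear model are all invented for this example).

```python
import queue
import threading

class Actor:
    """One thread draining a mailbox, handling one message at a time."""
    def __init__(self):
        self.mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, msg):
        self.mailbox.put(msg)

    def _run(self):
        while True:
            msg = self.mailbox.get()
            if msg is None:          # poison pill: stop this actor
                break
            self.receive(msg)

    def receive(self, msg):
        raise NotImplementedError

    def stop(self):
        self.mailbox.put(None)
        self._thread.join()

class Featurizer(Actor):
    """Turns raw text into features, then forwards them to the scorer."""
    def __init__(self, scorer):
        super().__init__()
        self.scorer = scorer

    def receive(self, msg):
        request_id, text = msg
        features = {"length": len(text), "bangs": text.count("!")}
        self.scorer.send((request_id, features))

class Scorer(Actor):
    """Applies a stub linear model to the features, records the score."""
    def __init__(self, results):
        super().__init__()
        self.results = results       # shared dict collecting outputs

    def receive(self, msg):
        request_id, features = msg
        score = 0.1 * features["length"] + 2.0 * features["bangs"]
        self.results[request_id] = score

results = {}
scorer = Scorer(results)
featurizer = Featurizer(scorer)
for i, text in enumerate(["hello", "wow!!"]):
    featurizer.send((i, text))       # requests flow through the pipeline
featurizer.stop()                    # drain featurizer, then scorer
scorer.stop()
print(results)
```

Because each actor owns its own thread, a slow featurization of one request overlaps with model evaluation of the previous one, which is the source of the throughput gain; scaling further would mean pooling several featurizer actors behind a router, a standard facility in production actor frameworks such as Akka.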