Deploying high throughput predictive models with the actor framework

Brian Gawalt
Proceedings of The 2nd International Conference on Predictive APIs and Apps, PMLR 50:15-28, 2016.

Abstract

The majority of data science and machine learning tutorials focus on generating models: assembling a dataset; splitting the data into training, validation, and testing subsets; building the model; and demonstrating its generalizability. But when it comes time to repeat the analogous steps with the model in production, issues of high latency or low throughput can arise. To an end user, the cost of too much time spent featurizing raw data and evaluating a model over those features can wind up erasing any gains a smarter prediction can offer. Exposing concurrency in these model-usage steps, and then capitalizing on that concurrency, can improve throughput. This paper describes how the actor framework can be used to bring a predictive model to a real-time setting. Two case-study examples are described: a live deployment built for the freelancing platform Upwork, and a simple text classifier with accompanying code for use as an introductory project.
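
As a rough illustration of the idea described in the abstract (a sketch, not the paper's accompanying code), the example below uses Scala with Akka classic actors: one actor featurizes raw text into bag-of-words counts while a second actor scores those features with a stub linear model, so featurization of one document can overlap with scoring of another. All names here (RawDoc, Featurizer, Scorer, the weight map) are hypothetical.

import akka.actor.{Actor, ActorRef, ActorSystem, Props}

// Hypothetical message types for the pipeline (not taken from the paper's code).
final case class RawDoc(id: Long, text: String)
final case class Features(id: Long, counts: Map[String, Int])

// Featurizer actor: converts raw text into bag-of-words counts, then forwards
// the features to the scoring actor.
class Featurizer(scorer: ActorRef) extends Actor {
  def receive: Receive = {
    case RawDoc(id, text) =>
      val counts = text.toLowerCase.split("\\s+").toSeq
        .groupBy(identity)
        .map { case (word, occurrences) => word -> occurrences.length }
      scorer ! Features(id, counts)
  }
}

// Scorer actor: applies a stub linear model (a word-weight map) to the counts.
class Scorer(weights: Map[String, Double]) extends Actor {
  def receive: Receive = {
    case Features(id, counts) =>
      val score = counts.foldLeft(0.0) {
        case (acc, (word, count)) => acc + weights.getOrElse(word, 0.0) * count
      }
      println(s"doc $id scored $score")
  }
}

object ScoringPipeline extends App {
  val system = ActorSystem("scoring-pipeline")
  val scorer = system.actorOf(Props(new Scorer(Map("spam" -> 2.0, "hello" -> -0.5))), "scorer")
  val featurizer = system.actorOf(Props(new Featurizer(scorer)), "featurizer")

  // Each actor processes its mailbox independently, so featurizing one document
  // can overlap with scoring another, improving throughput over a serial loop.
  featurizer ! RawDoc(1L, "hello hello spam")
  featurizer ! RawDoc(2L, "just a normal message")
}

In a real deployment, pools of featurizer and scorer actors (for example, behind routers) would sit between a request handler and the model; the sketch above only illustrates the message-passing structure the paper builds on.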

Cite this Paper


BibTeX
@InProceedings{pmlr-v50-gawalt15,
  title     = {Deploying high throughput predictive models with the actor framework},
  author    = {Gawalt, Brian},
  booktitle = {Proceedings of The 2nd International Conference on Predictive APIs and Apps},
  pages     = {15--28},
  year      = {2016},
  editor    = {Dorard, Louis and Reid, Mark D. and Martin, Francisco J.},
  volume    = {50},
  series    = {Proceedings of Machine Learning Research},
  address   = {Sydney, Australia},
  month     = {06--07 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v50/gawalt15.pdf},
  url       = {https://proceedings.mlr.press/v50/gawalt15.html},
  abstract  = {The majority of data science and machine learning tutorials focus on generating models: assembling a dataset; splitting the data into training, validation, and testing subsets; building the model; and demonstrating its generalizability. But when it’s time to repeat the analogous steps when using the model in production, issues of high latency or low throughput can arise. To an end user, the cost of too much time spent featurizing raw data and evaluating a model over features can wind up erasing any gains a smarter prediction can offer. Exposing concurrency in these model-usage steps, and then capitalizing on that concurrency, can improve throughput. This paper describes how the actor framework can be used to bring a predictive model to a real-time setting. Two case-study examples are described: a live deployment built for the freelancing platform Upwork, a simple text classifier with accompanying code for use as an introductory project.}
}
Endnote
%0 Conference Paper
%T Deploying high throughput predictive models with the actor framework
%A Brian Gawalt
%B Proceedings of The 2nd International Conference on Predictive APIs and Apps
%C Proceedings of Machine Learning Research
%D 2016
%E Louis Dorard
%E Mark D. Reid
%E Francisco J. Martin
%F pmlr-v50-gawalt15
%I PMLR
%P 15--28
%U https://proceedings.mlr.press/v50/gawalt15.html
%V 50
%X The majority of data science and machine learning tutorials focus on generating models: assembling a dataset; splitting the data into training, validation, and testing subsets; building the model; and demonstrating its generalizability. But when it’s time to repeat the analogous steps when using the model in production, issues of high latency or low throughput can arise. To an end user, the cost of too much time spent featurizing raw data and evaluating a model over features can wind up erasing any gains a smarter prediction can offer. Exposing concurrency in these model-usage steps, and then capitalizing on that concurrency, can improve throughput. This paper describes how the actor framework can be used to bring a predictive model to a real-time setting. Two case-study examples are described: a live deployment built for the freelancing platform Upwork, a simple text classifier with accompanying code for use as an introductory project.
RIS
TY - CPAPER
TI - Deploying high throughput predictive models with the actor framework
AU - Brian Gawalt
BT - Proceedings of The 2nd International Conference on Predictive APIs and Apps
DA - 2016/06/05
ED - Louis Dorard
ED - Mark D. Reid
ED - Francisco J. Martin
ID - pmlr-v50-gawalt15
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 50
SP - 15
EP - 28
L1 - http://proceedings.mlr.press/v50/gawalt15.pdf
UR - https://proceedings.mlr.press/v50/gawalt15.html
AB - The majority of data science and machine learning tutorials focus on generating models: assembling a dataset; splitting the data into training, validation, and testing subsets; building the model; and demonstrating its generalizability. But when it’s time to repeat the analogous steps when using the model in production, issues of high latency or low throughput can arise. To an end user, the cost of too much time spent featurizing raw data and evaluating a model over features can wind up erasing any gains a smarter prediction can offer. Exposing concurrency in these model-usage steps, and then capitalizing on that concurrency, can improve throughput. This paper describes how the actor framework can be used to bring a predictive model to a real-time setting. Two case-study examples are described: a live deployment built for the freelancing platform Upwork, a simple text classifier with accompanying code for use as an introductory project.
ER -
APA
Gawalt, B. (2016). Deploying high throughput predictive models with the actor framework. Proceedings of The 2nd International Conference on Predictive APIs and Apps, in Proceedings of Machine Learning Research 50:15-28. Available from https://proceedings.mlr.press/v50/gawalt15.html.