Scaling Machine Learning as a Service


Li Erran Li, Eric Chen, Jeremy Hermann, Pusheng Zhang, Luming Wang
Proceedings of The 3rd International Conference on Predictive Applications and APIs, PMLR 67:14-29, 2017.


Machine learning as a service (MLaaS) is critical to the success of many companies that need to extract business intelligence from big data. Building a scalable MLaaS for mission-critical, real-time applications is a challenging problem. In this paper, we present the scalable MLaaS we built for Uber, which operates globally. We focus on three scalability challenges: first, how to scale feature computation across many machine learning use cases; second, how to build accurate models from global data while accounting for individual city or region characteristics; third, how to enable scalable model deployment and real-time serving for hundreds of thousands of models across multiple data centers. To address them, we design and implement a scalable feature computing engine and feature store, a framework to manage and train a hierarchy of models as a single logical entity, and an automated one-click deployment system with a scalable real-time serving service.
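The idea of treating a hierarchy of models as a single logical entity can be illustrated with a minimal sketch. The abstract does not say how a request is matched to a model, so the most-specific-first fallback (city, then region, then global) is an assumption made here for illustration. The class `ModelHierarchy`, the logical name `eta_model`, the place names, and the toy linear models are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Features = Dict[str, float]
Model = Callable[[Features], float]

# Hypothetical hierarchy levels, listed from most to least specific.
LEVELS = ("city", "region", "global")


@dataclass
class ModelHierarchy:
    """Per-level models addressed through one logical name (illustrative only)."""

    name: str
    models: Dict[Tuple[str, str], Model] = field(default_factory=dict)

    def register(self, level: str, key: str, model: Model) -> None:
        """Attach a model to one node of the hierarchy."""
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.models[(level, key)] = model

    def resolve(self, city: str, region: str) -> Tuple[str, Model]:
        """Return the most specific registered model (assumed fallback rule)."""
        for level, key in (("city", city), ("region", region), ("global", "global")):
            model = self.models.get((level, key))
            if model is not None:
                return level, model
        raise LookupError(f"no model registered for {self.name}")

    def predict(self, city: str, region: str, features: Features) -> float:
        """Score a request with whichever model the hierarchy resolves to."""
        _, model = self.resolve(city, region)
        return model(features)


# Toy stand-ins for trained models; real ones would be learned from data.
eta = ModelHierarchy("eta_model")
eta.register("global", "global", lambda f: 10.0 + 2.0 * f["distance_km"])
eta.register("region", "us_west", lambda f: 8.0 + 2.5 * f["distance_km"])
eta.register("city", "san_francisco", lambda f: 12.0 + 3.0 * f["distance_km"])
```

Under these assumptions, callers refer only to the logical name, and cities without a dedicated model fall back to a regional or global one.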
