Efficient Online ML API Selection for Multi-Label Classification Tasks

Lingjiao Chen, Matei Zaharia, James Zou
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:3716-3746, 2022.

Abstract

Multi-label classification tasks such as OCR and multi-object recognition are a major focus of the growing machine learning as a service industry. While many multi-label APIs are available, it is challenging for users to decide which API to use for their own data and budget, due to the heterogeneity in their prices and performance. Recent work has shown how to efficiently select and combine single label APIs to optimize performance and cost. However, its computation cost is exponential in the number of labels, and is not suitable for settings like OCR. In this work, we propose FrugalMCT, a principled framework that adaptively selects the APIs to use for different data in an online fashion while respecting the user’s budget. It allows combining ML APIs’ predictions for any single data point, and selects the best combination based on an accuracy estimator. We run systematic experiments using ML APIs from Google, Microsoft, Amazon, IBM, Tencent, and other providers for tasks including multi-label image classification, scene text recognition, and named entity recognition. Across these tasks, FrugalMCT can achieve over 90% cost reduction while matching the accuracy of the best single API, or up to 8% better accuracy while matching the best API’s cost.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-chen22ad, title = {Efficient Online {ML} {API} Selection for Multi-Label Classification Tasks}, author = {Chen, Lingjiao and Zaharia, Matei and Zou, James}, booktitle = {Proceedings of the 39th International Conference on Machine Learning}, pages = {3716--3746}, year = {2022}, editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan}, volume = {162}, series = {Proceedings of Machine Learning Research}, month = {17--23 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v162/chen22ad/chen22ad.pdf}, url = {https://proceedings.mlr.press/v162/chen22ad.html}, abstract = {Multi-label classification tasks such as OCR and multi-object recognition are a major focus of the growing machine learning as a service industry. While many multi-label APIs are available, it is challenging for users to decide which API to use for their own data and budget, due to the heterogeneity in their prices and performance. Recent work has shown how to efficiently select and combine single label APIs to optimize performance and cost. However, its computation cost is exponential in the number of labels, and is not suitable for settings like OCR. In this work, we propose FrugalMCT, a principled framework that adaptively selects the APIs to use for different data in an online fashion while respecting the user’s budget. It allows combining ML APIs’ predictions for any single data point, and selects the best combination based on an accuracy estimator. We run systematic experiments using ML APIs from Google, Microsoft, Amazon, IBM, Tencent, and other providers for tasks including multi-label image classification, scene text recognition, and named entity recognition. Across these tasks, FrugalMCT can achieve over 90% cost reduction while matching the accuracy of the best single API, or up to 8% better accuracy while matching the best API’s cost.} }
Endnote
%0 Conference Paper %T Efficient Online ML API Selection for Multi-Label Classification Tasks %A Lingjiao Chen %A Matei Zaharia %A James Zou %B Proceedings of the 39th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2022 %E Kamalika Chaudhuri %E Stefanie Jegelka %E Le Song %E Csaba Szepesvari %E Gang Niu %E Sivan Sabato %F pmlr-v162-chen22ad %I PMLR %P 3716--3746 %U https://proceedings.mlr.press/v162/chen22ad.html %V 162 %X Multi-label classification tasks such as OCR and multi-object recognition are a major focus of the growing machine learning as a service industry. While many multi-label APIs are available, it is challenging for users to decide which API to use for their own data and budget, due to the heterogeneity in their prices and performance. Recent work has shown how to efficiently select and combine single label APIs to optimize performance and cost. However, its computation cost is exponential in the number of labels, and is not suitable for settings like OCR. In this work, we propose FrugalMCT, a principled framework that adaptively selects the APIs to use for different data in an online fashion while respecting the user’s budget. It allows combining ML APIs’ predictions for any single data point, and selects the best combination based on an accuracy estimator. We run systematic experiments using ML APIs from Google, Microsoft, Amazon, IBM, Tencent, and other providers for tasks including multi-label image classification, scene text recognition, and named entity recognition. Across these tasks, FrugalMCT can achieve over 90% cost reduction while matching the accuracy of the best single API, or up to 8% better accuracy while matching the best API’s cost.
APA
Chen, L., Zaharia, M. & Zou, J.. (2022). Efficient Online ML API Selection for Multi-Label Classification Tasks. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:3716-3746 Available from https://proceedings.mlr.press/v162/chen22ad.html.

Related Material