MLDemon: Deployment Monitoring for Machine Learning Systems

Tony Ginart, Martin Jinye Zhang, James Zou
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:3962-3997, 2022.

Abstract

Post-deployment monitoring of ML systems is critical for ensuring reliability, especially as new user inputs can differ from the training distribution. Here we propose a novel approach, MLDemon, for ML DEployment MONitoring. MLDemon integrates both unlabeled data and a small amount of on-demand labels to produce a real-time estimate of the ML model’s current performance on a given data stream. Subject to budget constraints, MLDemon decides when to acquire additional, potentially costly, expert supervised labels to verify the model. On temporal datasets with diverse distribution drifts and models, MLDemon outperforms existing approaches. Moreover, we provide theoretical analysis to show that MLDemon is minimax rate optimal for a broad class of distribution drifts.
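To make the monitoring setup concrete, here is a minimal, hypothetical sketch of the kind of loop the abstract describes: combining an unlabeled drift statistic with a small budget of on-demand expert labels to keep a live estimate of model accuracy. The names (monitor_stream, model, expert, budget, drift_threshold), the mean-shift drift signal, and the synthetic stream are all illustrative assumptions; this is not the paper's MLDemon policy or estimator.

    import numpy as np

    def monitor_stream(stream, model, expert, budget, window=25, drift_threshold=0.5):
        """Toy budget-constrained deployment monitor (illustrative only; not MLDemon itself).

        Keeps a running accuracy estimate from occasional expert labels and uses an
        unlabeled drift statistic (shift of the recent feature mean) to decide when
        to spend part of the label budget on verification.
        """
        ref_window, recent_window = [], []   # early-stream vs. recent feature buffers
        verified = []                        # correctness of expert-checked predictions
        acc_estimate, labels_used = 1.0, 0   # optimistic default until labels arrive

        for t, x in enumerate(stream):
            x = np.asarray(x, dtype=float)
            pred = model(x)

            # Fill the reference buffer from the start of the stream, then track recents.
            if len(ref_window) < window:
                ref_window.append(x)
            recent_window = (recent_window + [x])[-window:]

            # Unlabeled drift signal: distance between reference and recent feature means.
            drift = float(np.linalg.norm(
                np.mean(recent_window, axis=0) - np.mean(ref_window, axis=0)))

            # Query a costly expert label only when drift is large and budget remains.
            if drift > drift_threshold and labels_used < budget:
                labels_used += 1
                verified.append(float(pred == expert(x)))
                acc_estimate = float(np.mean(verified[-window:]))

            yield t, acc_estimate, labels_used

    # Hypothetical usage on a slowly drifting synthetic stream.
    rng = np.random.default_rng(0)
    stream = (rng.normal(loc=0.02 * t, scale=0.1, size=3) for t in range(300))
    model = lambda x: int(x.sum() > 1.0)     # deployed classifier
    expert = lambda x: int(x.sum() > 2.0)    # stand-in for on-demand ground truth
    history = list(monitor_stream(stream, model, expert, budget=30))
    print(history[-1])                       # (time step, accuracy estimate, labels used)

The design choices here (a fixed drift threshold, a frozen reference window, a running mean over verified predictions) are placeholders for the adaptive querying policy and estimator that the paper develops and analyzes.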

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-ginart22a,
  title     = {MLDemon: Deployment Monitoring for Machine Learning Systems},
  author    = {Ginart, Tony and Jinye Zhang, Martin and Zou, James},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {3962--3997},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/ginart22a/ginart22a.pdf},
  url       = {https://proceedings.mlr.press/v151/ginart22a.html}
}
Endnote
%0 Conference Paper
%T MLDemon: Deployment Monitoring for Machine Learning Systems
%A Tony Ginart
%A Martin Jinye Zhang
%A James Zou
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-ginart22a
%I PMLR
%P 3962--3997
%U https://proceedings.mlr.press/v151/ginart22a.html
%V 151
APA
Ginart, T., Jinye Zhang, M. & Zou, J. (2022). MLDemon: Deployment Monitoring for Machine Learning Systems. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:3962-3997. Available from https://proceedings.mlr.press/v151/ginart22a.html.