Boosting-Based Reliable Model Reuse

Yao-Xiang Ding, Zhi-Hua Zhou
Proceedings of The 12th Asian Conference on Machine Learning, PMLR 129:145-160, 2020.

Abstract

We study the following model reuse problem: a learner needs to select a subset of models from a model pool to classify an unlabeled dataset without accessing the raw training data of the models. In this setting, it is challenging to properly estimate the reusability of the models in the pool. In this work, we consider a model reuse protocol under which the learner receives specifications of the models, including reusability indicators that verify the models' prediction accuracy on any unlabeled instance. We propose MoreBoost, a simple yet powerful boosting algorithm that achieves effective model reuse under the idealized assumption that the reusability indicators are noise-free. When the reusability indicators are noisy, we strengthen MoreBoost with an active rectification mechanism, allowing the learner to actively query ground-truth indicator values from the model providers. The resulting MoreBoost.AR algorithm is guaranteed to significantly reduce the prediction error caused by the indicator noise. We also conduct experiments on both synthetic and benchmark datasets to verify the performance of the proposed approaches.
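The abstract describes a boosting-style procedure that repeatedly picks models from a pool using their reusability indicators in place of labels. The sketch below is a hypothetical illustration of that idea, not the paper's actual MoreBoost pseudocode: all function names and the indicator-as-proxy-error heuristic are assumptions made for exposition.

```python
import numpy as np

def moreboost_sketch(models, indicators, X, rounds=5):
    """Hypothetical AdaBoost-style model reuse loop.

    models:     list of callables X -> predictions in {-1, +1}.
    indicators: list of callables X -> {0, 1}; 1 means the paired model's
                prediction on that instance is deemed correct (noise-free case).
    Returns a list of (weight, model) pairs forming the reused ensemble.
    """
    n = len(X)
    w = np.full(n, 1.0 / n)              # instance weight distribution
    ensemble = []
    for _ in range(rounds):
        # Proxy weighted error of each pool model: mass of untrusted instances.
        errs = [float(np.sum(w * (1 - ind(X)))) for ind in indicators]
        k = int(np.argmin(errs))
        eps = max(errs[k], 1e-12)
        if eps >= 0.5:
            break                        # no model beats chance under w
        alpha = 0.5 * np.log((1 - eps) / eps)
        trusted = indicators[k](X)       # 1 where model k is deemed correct
        # Up-weight instances the chosen model is untrusted on, as in AdaBoost.
        w *= np.exp(alpha * (1 - 2 * trusted))
        w /= w.sum()
        ensemble.append((alpha, models[k]))
    return ensemble
```

Because the learner never sees labels, the indicator values stand in for per-instance correctness when computing the weighted error; under the noise-free assumption this mirrors standard boosting, while noisy indicators would motivate the active rectification of MoreBoost.AR.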

Cite this Paper


BibTeX
@InProceedings{pmlr-v129-ding20a,
  title     = {Boosting-Based Reliable Model Reuse},
  author    = {Ding, Yao-Xiang and Zhou, Zhi-Hua},
  booktitle = {Proceedings of The 12th Asian Conference on Machine Learning},
  pages     = {145--160},
  year      = {2020},
  editor    = {Pan, Sinno Jialin and Sugiyama, Masashi},
  volume    = {129},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--20 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v129/ding20a/ding20a.pdf},
  url       = {https://proceedings.mlr.press/v129/ding20a.html},
  abstract  = {We study the following model reuse problem: a learner needs to select a subset of models from a model pool to classify an unlabeled dataset without accessing the raw training data of the models. Under this situation, it is challenging to properly estimate the reusability of the models in the pool. In this work, we consider the model reuse protocol under which the learner receives specifications of the models, including reusability indicators to verify the models’ prediction accuracy on any unlabeled instances. We propose MoreBoost, a simple yet powerful boosting algorithm to achieve effective model reuse under the idealized assumption that the reusability indicators are noise-free. When the reusability indicators are noisy, we strengthen MoreBoost with an active rectification mechanism, allowing the learner to query ground-truth indicator values from the model providers actively. The resulted MoreBoost.AR algorithm is guaranteed to significantly reduce the prediction error caused by the indicator noise. We also conduct experiments on both synthetic and benchmark datasets to verify the performance of the proposed approaches.}
}
EndNote
%0 Conference Paper
%T Boosting-Based Reliable Model Reuse
%A Yao-Xiang Ding
%A Zhi-Hua Zhou
%B Proceedings of The 12th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Sinno Jialin Pan
%E Masashi Sugiyama
%F pmlr-v129-ding20a
%I PMLR
%P 145--160
%U https://proceedings.mlr.press/v129/ding20a.html
%V 129
%X We study the following model reuse problem: a learner needs to select a subset of models from a model pool to classify an unlabeled dataset without accessing the raw training data of the models. Under this situation, it is challenging to properly estimate the reusability of the models in the pool. In this work, we consider the model reuse protocol under which the learner receives specifications of the models, including reusability indicators to verify the models’ prediction accuracy on any unlabeled instances. We propose MoreBoost, a simple yet powerful boosting algorithm to achieve effective model reuse under the idealized assumption that the reusability indicators are noise-free. When the reusability indicators are noisy, we strengthen MoreBoost with an active rectification mechanism, allowing the learner to query ground-truth indicator values from the model providers actively. The resulted MoreBoost.AR algorithm is guaranteed to significantly reduce the prediction error caused by the indicator noise. We also conduct experiments on both synthetic and benchmark datasets to verify the performance of the proposed approaches.
APA
Ding, Y. & Zhou, Z. (2020). Boosting-Based Reliable Model Reuse. Proceedings of The 12th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 129:145-160. Available from https://proceedings.mlr.press/v129/ding20a.html.