[edit]
Boosting-Based Reliable Model Reuse
Proceedings of The 12th Asian Conference on Machine Learning, PMLR 129:145-160, 2020.
Abstract
We study the following model reuse problem: a learner needs to select a subset of models from a model pool to classify an unlabeled dataset without accessing the raw training data of the models. Under this situation, it is challenging to properly estimate the reusability of the models in the pool. In this work, we consider the model reuse protocol under which the learner receives specifications of the models, including reusability indicators to verify the models’ prediction accuracy on any unlabeled instances. We propose MoreBoost, a simple yet powerful boosting algorithm to achieve effective model reuse under the idealized assumption that the reusability indicators are noise-free. When the reusability indicators are noisy, we strengthen MoreBoost with an active rectification mechanism, allowing the learner to query ground-truth indicator values from the model providers actively. The resulted MoreBoost.AR algorithm is guaranteed to significantly reduce the prediction error caused by the indicator noise. We also conduct experiments on both synthetic and benchmark datasets to verify the performance of the proposed approaches.