Zero-shot AutoML with Pretrained Models

Ekrem Öztürk, Fabio Ferreira, Hadi Jomaa, Lars Schmidt-Thieme, Josif Grabocka, Frank Hutter
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:17138-17155, 2022.

Abstract

Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small? Here, we extend automated machine learning (AutoML) to best make these choices. Our domain-independent meta-learning approach learns a zero-shot surrogate model which, at test time, allows us to select the right deep learning (DL) pipeline (including the pre-trained model and fine-tuning hyperparameters) for a new dataset D given only trivial meta-features describing D, such as image resolution or the number of classes. To train this zero-shot model, we collect performance data for many DL pipelines on a large collection of datasets and meta-train on this data to minimize a pairwise ranking objective. We evaluate our approach under the strict time limit of the vision track of the ChaLearn AutoDL challenge benchmark, clearly outperforming all challenge contenders.
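The pairwise-ranking idea described above can be illustrated with a minimal sketch: a surrogate scores each (dataset meta-features, pipeline) pair and is trained so that, for every meta-training dataset, the pipeline that actually performed better receives the higher score; at test time the top-scoring pipeline is selected zero-shot. The linear model, logistic pairwise loss, feature encoding, and toy numbers below are illustrative assumptions, not the paper's actual architecture or data.

```python
import math

def score(w, feats):
    """Linear score for a concatenated (meta-features + pipeline encoding) vector."""
    return sum(wi * xi for wi, xi in zip(w, feats))

def train_pairwise(pairs, dim, lr=0.1, epochs=200):
    """Minimize the logistic pairwise-ranking loss -log sigmoid(s_winner - s_loser)
    by plain gradient descent over observed (winner, loser) feature-vector pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for winner, loser in pairs:
            margin = score(w, winner) - score(w, loser)
            g = -1.0 / (1.0 + math.exp(margin))  # d(loss)/d(margin)
            for k in range(dim):
                w[k] -= lr * g * (winner[k] - loser[k])
    return w

# Toy meta-training data. Feature vector (all values hypothetical):
# [image resolution / 100, num classes / 10, pipeline indicator (1 = A, 0 = B)].
# In this toy setup, pipeline A outperformed pipeline B on both datasets.
pairs = [
    ([2.24, 1.0, 1.0], [2.24, 1.0, 0.0]),  # A beat B on a 224px, 10-class dataset
    ([0.32, 0.3, 1.0], [0.32, 0.3, 0.0]),  # A beat B on a 32px, 3-class dataset
]
w = train_pairwise(pairs, dim=3)

# Zero-shot selection for an unseen dataset with meta-features [1.28, 0.5]:
# score both candidate pipelines and pick the argmax, with no trial runs on D.
candidates = {"A": [1.28, 0.5, 1.0], "B": [1.28, 0.5, 0.0]}
best = max(candidates, key=lambda name: score(w, candidates[name]))
```

Because selection is a single forward pass over cheap meta-features, it fits within a strict anytime budget like the AutoDL challenge's; all tuning cost is paid offline during meta-training.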

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-ozturk22a,
  title     = {Zero-shot {A}uto{ML} with Pretrained Models},
  author    = {{\"O}zt{\"u}rk, Ekrem and Ferreira, Fabio and Jomaa, Hadi and Schmidt-Thieme, Lars and Grabocka, Josif and Hutter, Frank},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {17138--17155},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/ozturk22a/ozturk22a.pdf},
  url       = {https://proceedings.mlr.press/v162/ozturk22a.html},
  abstract  = {Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small? Here, we extend automated machine learning (AutoML) to best make these choices. Our domain-independent meta-learning approach learns a zero-shot surrogate model which, at test time, allows to select the right deep learning (DL) pipeline (including the pre-trained model and fine-tuning hyperparameters) for a new dataset D given only trivial meta-features describing D such as image resolution or the number of classes. To train this zero-shot model, we collect performance data for many DL pipelines on a large collection of datasets and meta-train on this data to minimize a pairwise ranking objective. We evaluate our approach under the strict time limit of the vision track of the ChaLearn AutoDL challenge benchmark, clearly outperforming all challenge contenders.}
}
Endnote
%0 Conference Paper
%T Zero-shot AutoML with Pretrained Models
%A Ekrem Öztürk
%A Fabio Ferreira
%A Hadi Jomaa
%A Lars Schmidt-Thieme
%A Josif Grabocka
%A Frank Hutter
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-ozturk22a
%I PMLR
%P 17138--17155
%U https://proceedings.mlr.press/v162/ozturk22a.html
%V 162
%X Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small? Here, we extend automated machine learning (AutoML) to best make these choices. Our domain-independent meta-learning approach learns a zero-shot surrogate model which, at test time, allows to select the right deep learning (DL) pipeline (including the pre-trained model and fine-tuning hyperparameters) for a new dataset D given only trivial meta-features describing D such as image resolution or the number of classes. To train this zero-shot model, we collect performance data for many DL pipelines on a large collection of datasets and meta-train on this data to minimize a pairwise ranking objective. We evaluate our approach under the strict time limit of the vision track of the ChaLearn AutoDL challenge benchmark, clearly outperforming all challenge contenders.
APA
Öztürk, E., Ferreira, F., Jomaa, H., Schmidt-Thieme, L., Grabocka, J. & Hutter, F. (2022). Zero-shot AutoML with Pretrained Models. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:17138-17155. Available from https://proceedings.mlr.press/v162/ozturk22a.html.