Learning to Rank for Active Learning via Multi-Task Bilevel Optimization

Zixin Ding, Si Chen, Ruoxi Jia, Yuxin Chen
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:1112-1128, 2024.

Abstract

Active learning is a promising paradigm for reducing labeling costs by strategically requesting labels to improve model performance. However, existing active learning methods often rely on expensive acquisition functions, extensive model retraining, and multiple rounds of interaction with annotators. To address these limitations, we propose a novel approach for active learning, which aims to select batches of unlabeled instances through a learned surrogate model for data acquisition. A key challenge in this approach is to develop an acquisition function that generalizes well, as the history of data, which forms part of the utility function’s input, grows over time. Our novel algorithmic contribution is a multi-task bilevel optimization framework that predicts the relative utility—measured by the validation accuracy—of different training sets, and ensures the learned acquisition function generalizes effectively. For cases where validation accuracy is expensive to evaluate, we introduce efficient interpolation-based surrogate models to estimate the utility function, reducing the evaluation cost. We demonstrate the performance of our approach through extensive experiments on standard active classification benchmarks.
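The abstract describes learning an acquisition function that ranks candidate training sets by their relative utility (predicted validation accuracy). As a hedged illustration only, not the paper's actual method, the "learning to rank" idea can be sketched with a standard pairwise logistic ranking loss over surrogate utility scores; the scores below are hypothetical placeholders for a learned surrogate model's outputs:

```python
import math

def pairwise_ranking_loss(score_a, score_b, label):
    """Logistic pairwise ranking loss.

    label = +1 if candidate set A should outrank set B (i.e., A is
    predicted to yield higher validation accuracy), else -1.
    The loss is small when the sign of (score_a - score_b) agrees
    with the label, and grows when the predicted ordering is wrong.
    """
    return math.log(1.0 + math.exp(-label * (score_a - score_b)))

# Hypothetical surrogate scores for two candidate training sets.
loss_correct = pairwise_ranking_loss(0.9, 0.4, 1)  # ordering agrees with label
loss_wrong = pairwise_ranking_loss(0.4, 0.9, 1)    # ordering contradicts label
```

Training a surrogate with a loss of this shape only requires knowing which of two sets performed better, which is consistent with the abstract's focus on *relative* utility rather than exact accuracy values.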

Cite this Paper


BibTeX
@InProceedings{pmlr-v244-ding24a,
  title     = {Learning to Rank for Active Learning via Multi-Task Bilevel Optimization},
  author    = {Ding, Zixin and Chen, Si and Jia, Ruoxi and Chen, Yuxin},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages     = {1112--1128},
  year      = {2024},
  editor    = {Kiyavash, Negar and Mooij, Joris M.},
  volume    = {244},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/ding24a/ding24a.pdf},
  url       = {https://proceedings.mlr.press/v244/ding24a.html},
  abstract  = {Active learning is a promising paradigm for reducing labeling costs by strategically requesting labels to improve model performance. However, existing active learning methods often rely on expensive acquisition functions, extensive model retraining, and multiple rounds of interaction with annotators. To address these limitations, we propose a novel approach for active learning, which aims to select batches of unlabeled instances through a learned surrogate model for data acquisition. A key challenge in this approach is to develop an acquisition function that generalizes well, as the history of data, which forms part of the utility function’s input, grows over time. Our novel algorithmic contribution is a multi-task bilevel optimization framework that predicts the relative utility—measured by the validation accuracy—of different training sets, and ensures the learned acquisition function generalizes effectively. For cases where validation accuracy is expensive to evaluate, we introduce efficient interpolation-based surrogate models to estimate the utility function, reducing the evaluation cost. We demonstrate the performance of our approach through extensive experiments on standard active classification benchmarks.}
}
Endnote
%0 Conference Paper
%T Learning to Rank for Active Learning via Multi-Task Bilevel Optimization
%A Zixin Ding
%A Si Chen
%A Ruoxi Jia
%A Yuxin Chen
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-ding24a
%I PMLR
%P 1112--1128
%U https://proceedings.mlr.press/v244/ding24a.html
%V 244
%X Active learning is a promising paradigm for reducing labeling costs by strategically requesting labels to improve model performance. However, existing active learning methods often rely on expensive acquisition functions, extensive model retraining, and multiple rounds of interaction with annotators. To address these limitations, we propose a novel approach for active learning, which aims to select batches of unlabeled instances through a learned surrogate model for data acquisition. A key challenge in this approach is to develop an acquisition function that generalizes well, as the history of data, which forms part of the utility function’s input, grows over time. Our novel algorithmic contribution is a multi-task bilevel optimization framework that predicts the relative utility—measured by the validation accuracy—of different training sets, and ensures the learned acquisition function generalizes effectively. For cases where validation accuracy is expensive to evaluate, we introduce efficient interpolation-based surrogate models to estimate the utility function, reducing the evaluation cost. We demonstrate the performance of our approach through extensive experiments on standard active classification benchmarks.
APA
Ding, Z., Chen, S., Jia, R. & Chen, Y. (2024). Learning to Rank for Active Learning via Multi-Task Bilevel Optimization. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:1112-1128. Available from https://proceedings.mlr.press/v244/ding24a.html.