Making Better Use of Unlabelled Data in Bayesian Active Learning

Freddie Bickford Smith, Adam Foster, Tom Rainforth
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:847-855, 2024.

Abstract

Fully supervised models are predominant in Bayesian active learning. We argue that their neglect of the information present in unlabelled data harms not just predictive performance but also decisions about what data to acquire. Our proposed solution is a simple framework for semi-supervised Bayesian active learning. We find it produces better-performing models than either conventional Bayesian active learning or semi-supervised learning with randomly acquired data. It is also easier to scale up than the conventional approach. As well as supporting a shift towards semi-supervised models, our findings highlight the importance of studying models and acquisition methods in conjunction.
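To make the setting concrete, below is a minimal sketch of one Bayesian active-learning acquisition step, assuming an ensemble as a crude posterior approximation and a BALD-style mutual-information score. This is a generic illustration, not the paper's framework: the names (bald_score, acquire) and the toy models are hypothetical, and in the semi-supervised setting argued for above the model would additionally be trained on the unlabelled pool (for example via self-supervised pretraining) before scoring it.

    # Hypothetical sketch of a Bayesian active-learning acquisition step.
    # Posterior over models is approximated by an ensemble; inputs are scored
    # by BALD, the mutual information between the label and the model.
    import numpy as np

    rng = np.random.default_rng(0)

    def predictive_probs(ensemble, x):
        """Stack each member's class probabilities: shape (members, classes)."""
        return np.stack([member(x) for member in ensemble])

    def bald_score(probs):
        """BALD: entropy of the mean prediction minus mean member entropy."""
        mean = probs.mean(axis=0)
        entropy_of_mean = -np.sum(mean * np.log(mean + 1e-12))
        mean_entropy = -np.mean(np.sum(probs * np.log(probs + 1e-12), axis=1))
        return entropy_of_mean - mean_entropy

    def acquire(ensemble, pool):
        """Return the index of the pool input the model is most uncertain about."""
        scores = [bald_score(predictive_probs(ensemble, x)) for x in pool]
        return int(np.argmax(scores))

    # Toy demo: each "model" is a random linear-softmax classifier.
    def make_member(seed, features=2, classes=3):
        r = np.random.default_rng(seed)
        w = r.normal(size=(features, classes))
        def model(x):
            z = x @ w
            e = np.exp(z - z.max())
            return e / e.sum()
        return model

    ensemble = [make_member(s) for s in range(5)]
    pool = [rng.normal(size=2) for _ in range(100)]
    print("acquired pool index:", acquire(ensemble, pool))

BALD favours inputs on which plausible models disagree, so the quality of the posterior approximation shapes what gets acquired; this is one reason the abstract stresses studying models and acquisition methods in conjunction. The paper's specific acquisition functions and semi-supervised training schemes are not detailed on this page.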

Cite this Paper

BibTeX
@InProceedings{pmlr-v238-bickford-smith24a,
  title     = {Making Better Use of Unlabelled Data in {B}ayesian Active Learning},
  author    = {Bickford Smith, Freddie and Foster, Adam and Rainforth, Tom},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {847--855},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/bickford-smith24a/bickford-smith24a.pdf},
  url       = {https://proceedings.mlr.press/v238/bickford-smith24a.html},
  abstract  = {Fully supervised models are predominant in Bayesian active learning. We argue that their neglect of the information present in unlabelled data harms not just predictive performance but also decisions about what data to acquire. Our proposed solution is a simple framework for semi-supervised Bayesian active learning. We find it produces better-performing models than either conventional Bayesian active learning or semi-supervised learning with randomly acquired data. It is also easier to scale up than the conventional approach. As well as supporting a shift towards semi-supervised models, our findings highlight the importance of studying models and acquisition methods in conjunction.}
}
Endnote
%0 Conference Paper
%T Making Better Use of Unlabelled Data in Bayesian Active Learning
%A Freddie Bickford Smith
%A Adam Foster
%A Tom Rainforth
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-bickford-smith24a
%I PMLR
%P 847--855
%U https://proceedings.mlr.press/v238/bickford-smith24a.html
%V 238
%X Fully supervised models are predominant in Bayesian active learning. We argue that their neglect of the information present in unlabelled data harms not just predictive performance but also decisions about what data to acquire. Our proposed solution is a simple framework for semi-supervised Bayesian active learning. We find it produces better-performing models than either conventional Bayesian active learning or semi-supervised learning with randomly acquired data. It is also easier to scale up than the conventional approach. As well as supporting a shift towards semi-supervised models, our findings highlight the importance of studying models and acquisition methods in conjunction.
APA
Bickford Smith, F., Foster, A. & Rainforth, T. (2024). Making Better Use of Unlabelled Data in Bayesian Active Learning. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:847-855. Available from https://proceedings.mlr.press/v238/bickford-smith24a.html.