Making Your First Choice: To Address Cold Start Problem in Medical Active Learning

Liangyu Chen, Yutong Bai, Siyu Huang, Yongyi Lu, Bihan Wen, Alan Yuille, Zongwei Zhou
Medical Imaging with Deep Learning, PMLR 227:496-525, 2024.

Abstract

Active learning promises to improve annotation efficiency by iteratively selecting the most important data to be annotated first. However, we uncover a striking contradiction to this promise: at the first few choices, active learning fails to select data as efficiently as random selection. We identify this as the cold start problem in active learning, caused by a biased and outlier initial query. This paper seeks to address the cold start problem and develops a novel active querying strategy, named HaCon, that can exploit the three advantages of contrastive learning: (1) no annotation is required; (2) label diversity is ensured by pseudo-labels to mitigate bias; (3) typical data is determined by contrastive features to reduce outliers. Experiments on three public medical datasets show that HaCon not only significantly outperforms existing active querying strategies but also surpasses random selection by a large margin. Code is available at https://github.com/liangyuch/CSVAL.
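The label-free initial query described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption for illustration, not the authors' released implementation: the feature vectors are presumed to come from a contrastively pre-trained encoder (not shown), k-means cluster ids stand in for the pseudo-labels, and distance to the cluster centroid is used as a simple proxy for how "typical" a sample is. The function names `kmeans` and `initial_query` are hypothetical.

```python
import math


def farthest_point_init(features, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from all centers chosen so far."""
    centers = [list(features[0])]
    while len(centers) < k:
        nxt = max(features, key=lambda f: min(math.dist(f, c) for c in centers))
        centers.append(list(nxt))
    return centers


def assign(features, centers):
    """Return, for every feature vector, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: math.dist(f, centers[c]))
        for f in features
    ]


def kmeans(features, k, iters=20):
    """Plain k-means; the cluster ids play the role of pseudo-labels (assumption)."""
    centers = farthest_point_init(features, k)
    for _ in range(iters):
        labels = assign(features, centers)
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(features, centers), centers


def initial_query(features, k, budget):
    """Pick `budget` sample indices without using any annotation.

    Label diversity: samples are drawn round-robin across pseudo-label clusters.
    Outlier reduction: within a cluster, samples closest to the centroid go first
    (a stand-in typicality score, not the paper's criterion).
    """
    labels, centers = kmeans(features, k)
    ranked = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        members.sort(key=lambda i: math.dist(features[i], centers[c]))
        ranked.append(members)

    picked, depth = [], 0
    while len(picked) < budget and any(depth < len(r) for r in ranked):
        for r in ranked:
            if depth < len(r) and len(picked) < budget:
                picked.append(r[depth])
        depth += 1
    return picked
```

Calling `initial_query(features, k, budget)` on a list of equal-length feature vectors returns the indices to send for annotation first. The round-robin draw is what keeps a small budget from being spent on a single cluster, which is the bias the abstract attributes to the cold start.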

Cite this Paper


BibTeX
@InProceedings{pmlr-v227-chen24a,
  title = {Making Your First Choice: To Address Cold Start Problem in Medical Active Learning},
  author = {Chen, Liangyu and Bai, Yutong and Huang, Siyu and Lu, Yongyi and Wen, Bihan and Yuille, Alan and Zhou, Zongwei},
  booktitle = {Medical Imaging with Deep Learning},
  pages = {496--525},
  year = {2024},
  editor = {Oguz, Ipek and Noble, Jack and Li, Xiaoxiao and Styner, Martin and Baumgartner, Christian and Rusu, Mirabela and Heinmann, Tobias and Kontos, Despina and Landman, Bennett and Dawant, Benoit},
  volume = {227},
  series = {Proceedings of Machine Learning Research},
  month = {10--12 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v227/chen24a/chen24a.pdf},
  url = {https://proceedings.mlr.press/v227/chen24a.html},
  abstract = {Active learning promises to improve annotation efficiency by iteratively selecting the most important data to be annotated first. However, we uncover a striking contradiction to this promise: at the first few choices, active learning fails to select data as efficiently as random selection. We identify this as the cold start problem in active learning, caused by a biased and outlier initial query. This paper seeks to address the cold start problem and develops a novel active querying strategy, named HaCon, that can exploit the three advantages of contrastive learning: (1) no annotation is required; (2) label diversity is ensured by pseudo-labels to mitigate bias; (3) typical data is determined by contrastive features to reduce outliers. Experiments on three public medical datasets show that HaCon not only significantly outperforms existing active querying strategies but also surpasses random selection by a large margin. Code is available at https://github.com/liangyuch/CSVAL.}
}
Endnote
%0 Conference Paper
%T Making Your First Choice: To Address Cold Start Problem in Medical Active Learning
%A Liangyu Chen
%A Yutong Bai
%A Siyu Huang
%A Yongyi Lu
%A Bihan Wen
%A Alan Yuille
%A Zongwei Zhou
%B Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ipek Oguz
%E Jack Noble
%E Xiaoxiao Li
%E Martin Styner
%E Christian Baumgartner
%E Mirabela Rusu
%E Tobias Heinmann
%E Despina Kontos
%E Bennett Landman
%E Benoit Dawant
%F pmlr-v227-chen24a
%I PMLR
%P 496--525
%U https://proceedings.mlr.press/v227/chen24a.html
%V 227
%X Active learning promises to improve annotation efficiency by iteratively selecting the most important data to be annotated first. However, we uncover a striking contradiction to this promise: at the first few choices, active learning fails to select data as efficiently as random selection. We identify this as the cold start problem in active learning, caused by a biased and outlier initial query. This paper seeks to address the cold start problem and develops a novel active querying strategy, named HaCon, that can exploit the three advantages of contrastive learning: (1) no annotation is required; (2) label diversity is ensured by pseudo-labels to mitigate bias; (3) typical data is determined by contrastive features to reduce outliers. Experiments on three public medical datasets show that HaCon not only significantly outperforms existing active querying strategies but also surpasses random selection by a large margin. Code is available at https://github.com/liangyuch/CSVAL.
APA
Chen, L., Bai, Y., Huang, S., Lu, Y., Wen, B., Yuille, A., & Zhou, Z. (2024). Making Your First Choice: To Address Cold Start Problem in Medical Active Learning. Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 227:496-525. Available from https://proceedings.mlr.press/v227/chen24a.html.
