InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning

Zhe Huang, Xiaowei Yu, Dajiang Zhu, Michael C Hughes
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:20452-20473, 2024.

Abstract

Semi-supervised learning (SSL) seeks to enhance task performance by training on both labeled and unlabeled data. Mainstream SSL image classification methods mostly optimize a loss that additively combines a supervised classification objective with a regularization term derived solely from unlabeled data. This formulation often neglects the potential for interaction between labeled and unlabeled images. In this paper, we introduce InterLUDE, a new approach to enhance SSL made of two parts that each benefit from labeled-unlabeled interaction. The first part, embedding fusion, interpolates between labeled and unlabeled embeddings to improve representation learning. The second part is a new loss, grounded in the principle of consistency regularization, that aims to minimize discrepancies between the model's predictions on labeled and unlabeled inputs. Experiments on standard closed-set SSL benchmarks and a medical SSL task with an uncurated unlabeled set show clear benefits of our approach. On the STL-10 dataset with only 40 labels, InterLUDE achieves a 3.2% error rate, while the best previous method reports 6.3%.
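
The abstract only sketches the two components; for details, consult the paper PDF linked below. As a rough illustration of the two ideas, consider the following PyTorch sketch. This is an assumption-laden reading of the abstract, not the authors' published implementation: the function names, the fixed mixing coefficient lam, and the choice of mean-squared error between averaged predictive distributions are all illustrative choices of ours.

    # Hypothetical sketch of the two InterLUDE components named in the
    # abstract. Names, formulas, and hyperparameters are illustrative
    # assumptions, not the authors' actual method.
    import torch
    import torch.nn.functional as F

    def fuse_embeddings(z_labeled, z_unlabeled, lam=0.5):
        """Interpolate labeled and unlabeled embeddings ('embedding fusion').

        z_labeled, z_unlabeled: (batch, dim) encoder outputs; assumes equal
        batch sizes. lam: mixing coefficient (assumed fixed here; the paper
        may sample or schedule it differently).
        """
        return lam * z_labeled + (1.0 - lam) * z_unlabeled

    def labeled_unlabeled_consistency(logits_labeled, logits_unlabeled):
        """Penalize discrepancy between predictions on labeled vs. unlabeled inputs.

        One simple realization of the consistency idea in the abstract:
        match the two batches' mean predictive distributions.
        """
        p_l = F.softmax(logits_labeled, dim=-1).mean(dim=0)
        p_u = F.softmax(logits_unlabeled, dim=-1).mean(dim=0)
        return F.mse_loss(p_l, p_u)

In an SSL training loop, such a consistency term would typically be added to the supervised loss with a weighting hyperparameter, in the same additive style the abstract describes for mainstream SSL objectives.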

Cite this Paper

BibTeX
@InProceedings{pmlr-v235-huang24af,
  title     = {{I}nter{LUDE}: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning},
  author    = {Huang, Zhe and Yu, Xiaowei and Zhu, Dajiang and Hughes, Michael C},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {20452--20473},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/huang24af/huang24af.pdf},
  url       = {https://proceedings.mlr.press/v235/huang24af.html}
}
Endnote
%0 Conference Paper
%T InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning
%A Zhe Huang
%A Xiaowei Yu
%A Dajiang Zhu
%A Michael C Hughes
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-huang24af
%I PMLR
%P 20452--20473
%U https://proceedings.mlr.press/v235/huang24af.html
%V 235
APA
Huang, Z., Yu, X., Zhu, D. & Hughes, M.C. (2024). InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:20452-20473. Available from https://proceedings.mlr.press/v235/huang24af.html.
