JSOSAL: Joint Sampling for Open-Set Active Learning

Yongxiang Zhang, Bo Zhang, Zhiqiang Dai, Yangjie Cao
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:614-621, 2025.

Abstract

Traditional active learning methods typically operate under closed-set assumptions, where unlabeled data samples are selected for annotation from a pool consisting exclusively of known classes. However, real-world scenarios predominantly exhibit open-set conditions, characterized by the presence of substantial unknown-class instances within datasets. This fundamental discrepancy renders most conventional active learning approaches ineffective in practical applications. To address the annotation challenge in open-set environments, we propose JSOSAL (Joint Sampling for Open-Set Active Learning), an innovative approach that applies a Bayesian Gaussian Mixture Model (BGMM) to represent the probability distribution of the highest activation values, enabling effective discrimination between known and unknown classes. Our method subsequently selects high-entropy samples from the identified known-class subset for annotation. Rigorous testing on CIFAR-10 and CIFAR-100 shows that JSOSAL achieves superior performance compared to existing leading methods.
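
The two-stage idea in the abstract (filter likely known-class samples from their highest activation values, then query the most uncertain of those) can be illustrated with a small sketch. This is not the authors' implementation. To stay dependency-free, it swaps the BGMM for a plain two-component 1-D Gaussian mixture fitted by EM. It also assumes that the component with the higher mean corresponds to known classes, and the names `fit_two_gaussians`, `known_probability`, `select_queries`, `budget`, and `threshold` are hypothetical.

```python
import math

def fit_two_gaussians(values, iters=50):
    """EM for a 1-D two-component Gaussian mixture (stand-in for the BGMM)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]
    var = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [var, var]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [
                weights[k]
                * math.exp(-((v - means[k]) ** 2) / (2 * variances[k]))
                / math.sqrt(2 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            sq = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values))
            variances[k] = max(sq / nk, 1e-6)
            weights[k] = nk / len(values)
    return means, variances, weights

def known_probability(v, means, variances, weights):
    """Posterior of the higher-mean component (ASSUMED to be 'known')."""
    k_known = 0 if means[0] > means[1] else 1
    dens = [
        weights[k]
        * math.exp(-((v - means[k]) ** 2) / (2 * variances[k]))
        / math.sqrt(2 * math.pi * variances[k])
        for k in range(2)
    ]
    return dens[k_known] / (sum(dens) or 1e-300)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(logits_pool, budget, threshold=0.5):
    """Keep samples judged 'known', then return the highest-entropy ones."""
    max_act = [max(logits) for logits in logits_pool]
    means, variances, weights = fit_two_gaussians(max_act)
    candidates = [
        i for i, v in enumerate(max_act)
        if known_probability(v, means, variances, weights) >= threshold
    ]
    candidates.sort(key=lambda i: entropy(softmax(logits_pool[i])), reverse=True)
    return candidates[:budget]

# Toy pool: indices 0-3 have large peak activations, indices 4-7 do not.
pool = [
    [5.0, 4.8, 0.1], [6.0, 0.2, 0.1], [5.5, 4.5, 0.0], [6.2, 1.0, 0.5],
    [0.5, 0.4, 0.3], [0.6, 0.5, 0.5], [0.4, 0.3, 0.45], [0.7, 0.2, 0.1],
]
selected = select_queries(pool, budget=2)
```

In this toy pool the low-activation samples have nearly uniform softmax outputs, so a pure entropy criterion would query them first; the mixture filter removes them before the entropy ranking is applied. With scikit-learn available, `sklearn.mixture.BayesianGaussianMixture` could replace the hand-written EM step.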

Cite this Paper


BibTeX
@InProceedings{pmlr-v278-zhang25e,
  title = {JSOSAL: Joint Sampling for Open-Set Active Learning},
  author = {Zhang, Yongxiang and Zhang, Bo and Dai, Zhiqiang and Cao, Yangjie},
  booktitle = {Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing},
  pages = {614--621},
  year = {2025},
  editor = {Zeng, Nianyin and Pachori, Ram Bilas and Wang, Dongshu},
  volume = {278},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v278/main/assets/zhang25e/zhang25e.pdf},
  url = {https://proceedings.mlr.press/v278/zhang25e.html},
  abstract = {Traditional active learning methods typically operate under closed-set assumptions, where unlabeled data samples are selected for annotation from a pool consisting exclusively of known classes. However, real-world scenarios predominantly exhibit open-set conditions, characterized by the presence of substantial unknown-class instances within datasets. This fundamental discrepancy renders most conventional active learning approaches ineffective in practical applications. To address the annotation challenge in open-set environments, we propose JSOSAL (Joint Sampling for Open-Set Active Learning), an innovative approach that applies a Bayesian Gaussian Mixture Model (BGMM) to represent the probability distribution of the highest activation values, enabling effective discrimination between known and unknown classes. Our method subsequently selects high-entropy samples from the identified known-class subset for annotation. Rigorous testing on CIFAR-10 and CIFAR-100 shows that JSOSAL achieves superior performance compared to existing leading methods.}
}
Endnote
%0 Conference Paper
%T JSOSAL: Joint Sampling for Open-Set Active Learning
%A Yongxiang Zhang
%A Bo Zhang
%A Zhiqiang Dai
%A Yangjie Cao
%B Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing
%C Proceedings of Machine Learning Research
%D 2025
%E Nianyin Zeng
%E Ram Bilas Pachori
%E Dongshu Wang
%F pmlr-v278-zhang25e
%I PMLR
%P 614--621
%U https://proceedings.mlr.press/v278/zhang25e.html
%V 278
%X Traditional active learning methods typically operate under closed-set assumptions, where unlabeled data samples are selected for annotation from a pool consisting exclusively of known classes. However, real-world scenarios predominantly exhibit open-set conditions, characterized by the presence of substantial unknown-class instances within datasets. This fundamental discrepancy renders most conventional active learning approaches ineffective in practical applications. To address the annotation challenge in open-set environments, we propose JSOSAL (Joint Sampling for Open-Set Active Learning), an innovative approach that applies a Bayesian Gaussian Mixture Model (BGMM) to represent the probability distribution of the highest activation values, enabling effective discrimination between known and unknown classes. Our method subsequently selects high-entropy samples from the identified known-class subset for annotation. Rigorous testing on CIFAR-10 and CIFAR-100 shows that JSOSAL achieves superior performance compared to existing leading methods.
APA
Zhang, Y., Zhang, B., Dai, Z., & Cao, Y. (2025). JSOSAL: Joint Sampling for Open-Set Active Learning. Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, in Proceedings of Machine Learning Research 278:614-621. Available from https://proceedings.mlr.press/v278/zhang25e.html.
