Positive and Unlabeled Learning with Controlled Probability Boundary Fence

Changchun Li, Yuanchao Dai, Lei Feng, Ximing Li, Bing Wang, Jihong Ouyang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:27641-27652, 2024.

Abstract

Positive and Unlabeled (PU) learning refers to a special case of binary classification, and technically, it aims to induce a binary classifier from a few labeled positive training instances and many unlabeled instances. In this paper, we derive a theorem indicating that the probability boundary of the asymmetric disambiguation-free expected risk of PU learning is controlled by its asymmetric penalty, and we further evaluate this theorem empirically. Inspired by the theorem and its empirical evaluations, we propose an easy-to-implement two-stage PU learning method, namely Positive and Unlabeled Learning with Controlled Probability Boundary Fence (PULCPBF). In the first stage, we train a set of weak binary classifiers concerning different probability boundaries by minimizing the asymmetric disambiguation-free empirical risks with specific asymmetric penalty values. We can interpret these induced weak binary classifiers as a probability boundary fence. For each unlabeled instance, we can use the predictions to locate its class posterior probability and generate a stochastic label. In the second stage, we train a strong binary classifier over labeled positive training instances and all unlabeled instances with stochastic labels in a self-training manner. Extensive empirical results demonstrate that PULCPBF can achieve competitive performance compared with the existing PU learning baselines.
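
As a rough illustration of the fence idea described in the abstract, the following Python sketch shows how binary votes from several weak classifiers could be turned into a posterior estimate and a stochastic label. It is not the authors' implementation: the function names, the vote-counting rule, the interval-midpoint estimate, and the example boundary values are all assumptions made for illustration.

```python
import random


def locate_posterior(votes, boundaries):
    """Estimate a class posterior from fence votes (hypothetical helper).

    votes[k] is 1 if weak classifier k predicts "positive", which we assume
    means the instance lies above that classifier's probability boundary
    boundaries[k]. Boundaries are assumed sorted in ascending order.
    Returns the midpoint of the interval that the votes point to.
    """
    if len(votes) != len(boundaries):
        raise ValueError("need exactly one vote per boundary")
    # Counting positive votes (rather than finding the first 0) is an
    # assumption that tolerates non-monotone vote patterns.
    k = sum(votes)
    edges = [0.0] + list(boundaries) + [1.0]
    return (edges[k] + edges[k + 1]) / 2.0


def stochastic_label(posterior, rng):
    """Draw a label in {0, 1} that is 1 with probability `posterior`."""
    return 1 if rng.random() < posterior else 0


# Assumed example: four weak classifiers with evenly spaced boundaries.
boundaries = [0.2, 0.4, 0.6, 0.8]
votes = [1, 1, 1, 0]  # above 0.2, 0.4, 0.6 but not above 0.8
posterior = locate_posterior(votes, boundaries)  # midpoint of (0.6, 0.8)

rng = random.Random(0)  # seeded so the draw is reproducible
label = stochastic_label(posterior, rng)
```

In a two-stage pipeline of the kind the abstract outlines, labels drawn this way would be attached to the unlabeled instances before training the second-stage classifier. How the paper actually maps predictions to a posterior is not specified in the abstract.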

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-li24p,
  title = {Positive and Unlabeled Learning with Controlled Probability Boundary Fence},
  author = {Li, Changchun and Dai, Yuanchao and Feng, Lei and Li, Ximing and Wang, Bing and Ouyang, Jihong},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {27641--27652},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/li24p/li24p.pdf},
  url = {https://proceedings.mlr.press/v235/li24p.html},
  abstract = {Positive and Unlabeled (PU) learning refers to a special case of binary classification, and technically, it aims to induce a binary classifier from a few labeled positive training instances and loads of unlabeled instances. In this paper, we derive a theorem indicating that the probability boundary of the asymmetric disambiguation-free expected risk of PU learning is controlled by its asymmetric penalty, and we further empirically evaluated this theorem. Inspired by the theorem and its empirical evaluations, we propose an easy-to-implement two-stage PU learning method, namely Positive and Unlabeled Learning with Controlled Probability Boundary Fence (PULCPBF). In the first stage, we train a set of weak binary classifiers concerning different probability boundaries by minimizing the asymmetric disambiguation-free empirical risks with specific asymmetric penalty values. We can interpret these induced weak binary classifiers as a probability boundary fence. For each unlabeled instance, we can use the predictions to locate its class posterior probability and generate a stochastic label. In the second stage, we train a strong binary classifier over labeled positive training instances and all unlabeled instances with stochastic labels in a self-training manner. Extensive empirical results demonstrate that PULCPBF can achieve competitive performance compared with the existing PU learning baselines.}
}
Endnote
%0 Conference Paper
%T Positive and Unlabeled Learning with Controlled Probability Boundary Fence
%A Changchun Li
%A Yuanchao Dai
%A Lei Feng
%A Ximing Li
%A Bing Wang
%A Jihong Ouyang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-li24p
%I PMLR
%P 27641--27652
%U https://proceedings.mlr.press/v235/li24p.html
%V 235
%X Positive and Unlabeled (PU) learning refers to a special case of binary classification, and technically, it aims to induce a binary classifier from a few labeled positive training instances and loads of unlabeled instances. In this paper, we derive a theorem indicating that the probability boundary of the asymmetric disambiguation-free expected risk of PU learning is controlled by its asymmetric penalty, and we further empirically evaluated this theorem. Inspired by the theorem and its empirical evaluations, we propose an easy-to-implement two-stage PU learning method, namely Positive and Unlabeled Learning with Controlled Probability Boundary Fence (PULCPBF). In the first stage, we train a set of weak binary classifiers concerning different probability boundaries by minimizing the asymmetric disambiguation-free empirical risks with specific asymmetric penalty values. We can interpret these induced weak binary classifiers as a probability boundary fence. For each unlabeled instance, we can use the predictions to locate its class posterior probability and generate a stochastic label. In the second stage, we train a strong binary classifier over labeled positive training instances and all unlabeled instances with stochastic labels in a self-training manner. Extensive empirical results demonstrate that PULCPBF can achieve competitive performance compared with the existing PU learning baselines.
APA
Li, C., Dai, Y., Feng, L., Li, X., Wang, B., & Ouyang, J. (2024). Positive and Unlabeled Learning with Controlled Probability Boundary Fence. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:27641-27652. Available from https://proceedings.mlr.press/v235/li24p.html.
