Non-Stationary Predictions May Be More Informative: Exploring Pseudo-Labels with a Two-Phase Pattern of Training Dynamics
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:48662-48678, 2025.
Abstract
Pseudo-labeling is a widely used strategy in semi-supervised learning. Existing methods typically select predicted labels with high confidence scores and high training stationarity as pseudo-labels to augment training sets. In contrast, this paper explores the pseudo-labeling potential of predicted labels that do not exhibit these characteristics. We discover a new type of predicted label suitable for pseudo-labeling, termed two-phase labels, which exhibit a two-phase pattern during training: they are initially predicted as one category in early training stages and switch to another category in subsequent epochs. Case studies show that the two-phase labels are informative for decision boundaries. To effectively identify the two-phase labels, we design a 2-phasic metric that mathematically characterizes their spatial and temporal patterns. Furthermore, we propose a loss function tailored for learning with two-phase pseudo-labels, allowing models not only to learn correct correlations but also to eliminate false ones. Extensive experiments on eight datasets show that our proposed 2-phasic metric acts as a powerful booster for existing pseudo-labeling methods by additionally incorporating the two-phase labels, achieving an average classification accuracy gain of 1.73% on image datasets and 1.92% on graph datasets.
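
To make the two-phase pattern concrete, the following is a minimal illustrative sketch, not the paper's 2-phasic metric. It assumes only that the per-epoch predicted (hard) labels of one unlabeled sample are available, and it flags a sample when its prediction changes exactly once and both phases are reasonably long. The function name `find_two_phase_switch` and the parameter `min_phase_len` are hypothetical, introduced here for illustration; the actual metric described in the abstract also accounts for spatial patterns, which this toy check ignores.

```python
from typing import Optional, Sequence, Tuple


def find_two_phase_switch(
    preds: Sequence[int], min_phase_len: int = 3
) -> Optional[Tuple[int, int, int]]:
    """Return (switch_epoch, early_label, late_label) if `preds` changes label
    exactly once and both phases last at least `min_phase_len` epochs;
    otherwise return None. This is a toy heuristic, not the 2-phasic metric."""
    # Epoch indices where the predicted label differs from the previous epoch.
    changes = [t for t in range(1, len(preds)) if preds[t] != preds[t - 1]]
    if len(changes) != 1:
        return None  # stationary (no change) or oscillating (several changes)
    switch = changes[0]
    if switch < min_phase_len or len(preds) - switch < min_phase_len:
        return None  # one of the two phases is too short to count as a phase
    return switch, preds[0], preds[-1]


# Hypothetical per-epoch prediction histories for three unlabeled samples.
histories = {
    "stationary": [2, 2, 2, 2, 2, 2, 2, 2],
    "two_phase": [0, 0, 0, 0, 1, 1, 1, 1],
    "oscillating": [0, 1, 0, 1, 0, 1, 0, 1],
}
results = {name: find_two_phase_switch(h) for name, h in histories.items()}
```

Under these assumptions, only the `two_phase` history is flagged; a stationarity-based selector would instead favor the `stationary` history, which is the contrast the abstract draws.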