Token-Aware Representation Augmentation for Fine-Grained Semi-Supervised Learning

Hongyang He, Yan Zhong, Xinyuan Song, Daizong Liu, Victor Sanchez
Conference on Parsimony and Learning, PMLR 328:516-528, 2026.

Abstract

FixMatch is a widely adopted semi-supervised learning (SSL) framework that relies on consistency regularization between weakly and strongly augmented versions of unlabeled data. In the case of image classification, its reliance on indiscriminate image-level augmentations often leads to overfitting on early confident predictions while neglecting semantically rich but underexplored features. In this work, we introduce Token-Aware FixMatch (TA-FixMatch), a novel SSL framework that operates at the token representation level to enhance feature diversity and generalization. Specifically, we propose a token-aware masking strategy that identifies and softly suppresses the most influential tokens contributing to high-confidence predictions; and a structured token-level augmentation pipeline that perturbs, reorganizes, and semantically enriches the remaining tokens. These representation-level augmentations guide the model to attend to alternative evidence and discover complementary features, which is particularly beneficial in fine-grained classification tasks. Extensive experiments on standard (CIFAR-100, STL-10) and fine-grained (CUB-200-2011, NABirds, Stanford Cars) benchmarks demonstrate that TA-FixMatch outperforms existing state-of-the-art SSL methods under low-label regimes.
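The abstract's central mechanism, softly suppressing the tokens most responsible for a high-confidence prediction so the model must attend to alternative evidence, can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the saliency input (e.g. attention weights toward the class token), and the suppression factor are all hypothetical choices for illustration:

```python
import numpy as np

def token_aware_soft_mask(tokens, saliency, top_frac=0.25, suppress=0.5):
    """Hypothetical sketch of token-aware soft masking.

    tokens:   (N, D) array of token representations
    saliency: (N,) per-token importance scores (assumed given,
              e.g. attention toward the class token)
    top_frac: fraction of tokens treated as "most influential"
    suppress: scale applied to those tokens (soft, not a hard drop)
    """
    n_top = max(1, int(round(top_frac * len(saliency))))
    top_idx = np.argsort(saliency)[-n_top:]   # indices of most salient tokens
    scale = np.ones(len(saliency))
    scale[top_idx] = suppress                 # softly down-weight, keep the rest
    return tokens * scale[:, None]

# Toy usage: 8 tokens of dimension 4 with random saliency scores.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))
saliency = rng.random(8)
masked = token_aware_soft_mask(tokens, saliency)
```

Under this reading, the remaining (unsuppressed) tokens would then be fed to the paper's token-level augmentation pipeline; the details of that pipeline are in the linked PDF.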

Cite this Paper


BibTeX
@InProceedings{pmlr-v328-he26b,
  title     = {Token-Aware Representation Augmentation for Fine-Grained Semi-Supervised Learning},
  author    = {He, Hongyang and Zhong, Yan and Song, Xinyuan and Liu, Daizong and Sanchez, Victor},
  booktitle = {Conference on Parsimony and Learning},
  pages     = {516--528},
  year      = {2026},
  editor    = {Burkholz, Rebekka and Liu, Shiwei and Ravishankar, Saiprasad and Redman, William and Huang, Wei and Su, Weijie and Zhu, Zhihui},
  volume    = {328},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Mar},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v328/main/assets/he26b/he26b.pdf},
  url       = {https://proceedings.mlr.press/v328/he26b.html},
  abstract  = {FixMatch is a widely adopted semi-supervised learning (SSL) framework that relies on consistency regularization between weakly and strongly augmented versions of unlabeled data. In the case of image classification, its reliance on indiscriminate image-level augmentations often leads to overfitting on early confident predictions while neglecting semantically rich but underexplored features. In this work, we introduce Token-Aware FixMatch (TA-FixMatch), a novel SSL framework that operates at the token representation level to enhance feature diversity and generalization. Specifically, we propose a token-aware masking strategy that identifies and softly suppresses the most influential tokens contributing to high-confidence predictions; and a structured token-level augmentation pipeline that perturbs, reorganizes, and semantically enriches the remaining tokens. These representation-level augmentations guide the model to attend to alternative evidence and discover complementary features, which is particularly beneficial in fine-grained classification tasks. Extensive experiments on standard (CIFAR-100, STL-10) and fine-grained (CUB-200-2011, NABirds, Stanford Cars) benchmarks demonstrate that TA-FixMatch outperforms existing state-of-the-art SSL methods under low-label regimes.}
}
Endnote
%0 Conference Paper
%T Token-Aware Representation Augmentation for Fine-Grained Semi-Supervised Learning
%A Hongyang He
%A Yan Zhong
%A Xinyuan Song
%A Daizong Liu
%A Victor Sanchez
%B Conference on Parsimony and Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Rebekka Burkholz
%E Shiwei Liu
%E Saiprasad Ravishankar
%E William Redman
%E Wei Huang
%E Weijie Su
%E Zhihui Zhu
%F pmlr-v328-he26b
%I PMLR
%P 516--528
%U https://proceedings.mlr.press/v328/he26b.html
%V 328
%X FixMatch is a widely adopted semi-supervised learning (SSL) framework that relies on consistency regularization between weakly and strongly augmented versions of unlabeled data. In the case of image classification, its reliance on indiscriminate image-level augmentations often leads to overfitting on early confident predictions while neglecting semantically rich but underexplored features. In this work, we introduce Token-Aware FixMatch (TA-FixMatch), a novel SSL framework that operates at the token representation level to enhance feature diversity and generalization. Specifically, we propose a token-aware masking strategy that identifies and softly suppresses the most influential tokens contributing to high-confidence predictions; and a structured token-level augmentation pipeline that perturbs, reorganizes, and semantically enriches the remaining tokens. These representation-level augmentations guide the model to attend to alternative evidence and discover complementary features, which is particularly beneficial in fine-grained classification tasks. Extensive experiments on standard (CIFAR-100, STL-10) and fine-grained (CUB-200-2011, NABirds, Stanford Cars) benchmarks demonstrate that TA-FixMatch outperforms existing state-of-the-art SSL methods under low-label regimes.
APA
He, H., Zhong, Y., Song, X., Liu, D. & Sanchez, V. (2026). Token-Aware Representation Augmentation for Fine-Grained Semi-Supervised Learning. Conference on Parsimony and Learning, in Proceedings of Machine Learning Research 328:516-528. Available from https://proceedings.mlr.press/v328/he26b.html.