Identity-Disentangled Adversarial Augmentation for Self-supervised Learning
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:25364-25381, 2022.
Abstract
Data augmentation is critical to contrastive self-supervised learning, whose goal is to distinguish a sample’s augmentations (positives) from other samples (negatives). However, strong augmentations may change the sample identity of the positives, while weak augmentations produce easy positives/negatives, leading to near-zero loss and ineffective learning. In this paper, we study a simple adversarial augmentation method that can modify training data into hard positives/negatives without distorting the key information about their original identities. In particular, we decompose a sample x into its variational auto-encoder (VAE) reconstruction G(x) plus the residual R(x)=x−G(x), where, under an information-theoretic interpretation of the VAE objective, R(x) retains most of the identity-distinctive information. We then adversarially perturb G(x) in the VAE’s bottleneck space and add it back to the original R(x) as an augmentation, which is therefore sufficiently challenging for contrastive learning while keeping the sample identity intact. We apply this “identity-disentangled adversarial augmentation (IDAA)” to different self-supervised learning methods. On multiple benchmark datasets, IDAA consistently improves both their efficiency and generalization performance. We further show that IDAA learned on one dataset can be transferred to other datasets. Code is available at https://github.com/kai-wen-yang/IDAA.
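The decomposition and augmentation described in the abstract can be sketched in a few lines. This is only an illustrative toy, not the authors' implementation: a linear autoencoder with tied weights (hypothetical matrix `W`) stands in for the VAE, `grad_z` is a placeholder for the gradient of a contrastive loss with respect to the bottleneck code, and the signed step of size `epsilon` is an assumed choice of perturbation that the abstract does not specify.

```python
def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def encode(x, W):
    # Toy linear encoder: z = W x (a stand-in for the VAE encoder).
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def decode(z, W):
    # Toy linear decoder with tied weights: G = W^T z.
    dim = len(W[0])
    return [sum(W[k][j] * z[k] for k in range(len(z))) for j in range(dim)]

def idaa_augment(x, W, grad_z, epsilon):
    z = encode(x, W)
    g = decode(z, W)                              # reconstruction G(x)
    r = [xi - gi for xi, gi in zip(x, g)]         # residual R(x) = x - G(x)
    # Assumed perturbation: a signed step in the bottleneck space.
    z_adv = [zi + epsilon * sign(gz) for zi, gz in zip(z, grad_z)]
    g_adv = decode(z_adv, W)                      # perturbed reconstruction
    x_aug = [ga + ri for ga, ri in zip(g_adv, r)] # add the original residual back
    return x_aug, r

# Hypothetical toy values, chosen only for illustration.
W = [[0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
x = [1.0, 2.0, 3.0]
grad_z = [0.5, -2.0]
x_aug, r = idaa_augment(x, W, grad_z, epsilon=0.1)
```

In this sketch the residual `r` is computed once from the original sample and reused unchanged, so only the reconstructed part of the sample moves; with a zero perturbation the function returns the original sample.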