Unlabeled Data Help: Minimax Analysis and Adversarial Robustness
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:136-168, 2022.
Recently proposed self-supervised learning (SSL) approaches demonstrate the great potential of supplementing learning algorithms with additional unlabeled data. However, it is still unclear whether existing SSL algorithms can fully utilize the information in both labeled and unlabeled data. This paper gives an affirmative answer for the reconstruction-based SSL algorithm (Lee et al., 2020) under several statistical models. While the existing literature focuses only on establishing upper bounds on the convergence rate, we provide a rigorous minimax analysis and justify the rate-optimality of the reconstruction-based SSL algorithm under different data generation models. Furthermore, we incorporate reconstruction-based SSL into existing adversarial training algorithms and show that learning from unlabeled data helps improve robustness.