Sample Efficiency of Data Augmentation Consistency Regularization

Shuo Yang, Yijun Dong, Rachel Ward, Inderjit S. Dhillon, Sujay Sanghavi, Qi Lei
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:3825-3853, 2023.

Abstract

Data augmentation is popular in the training of large neural networks; however, the theoretical understanding of how different algorithmic choices for leveraging augmented data compare remains limited. In this paper, we take a step in this direction: we first present a simple and novel analysis for linear regression with label-invariant augmentations, demonstrating that data augmentation consistency (DAC) regularization is intrinsically more efficient than empirical risk minimization on augmented data (DA-ERM). The analysis is then generalized to misspecified augmentations (i.e., augmentations that change the labels), which again demonstrates the merit of DAC over DA-ERM. Further, we extend our analysis to non-linear models (e.g., neural networks) and present generalization bounds. Finally, we perform experiments that make a clean, apples-to-apples comparison (i.e., with no extra modeling or data tweaks) between DAC and DA-ERM using CIFAR-100 and WideResNet; together, these demonstrate the superior efficacy of DAC.
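To make the contrast in the abstract concrete, below is a minimal PyTorch-style sketch of the two training objectives: DA-ERM simply adds the augmented samples to the empirical risk, whereas DAC keeps the supervised loss on the original samples and adds a penalty that forces predictions on an example and its augmentation to agree. The names (model, augment, lam) and the squared-error consistency penalty are illustrative assumptions for a classification setting, not the paper's exact (regression-based) formulation.

import torch
import torch.nn.functional as F

def da_erm_loss(model, x, y, augment):
    # DA-ERM: empirical risk minimization over original and augmented samples alike.
    x_aug = augment(x)
    return F.cross_entropy(model(x), y) + F.cross_entropy(model(x_aug), y)

def dac_loss(model, x, y, augment, lam=1.0):
    # DAC: supervised loss on the original samples only, plus a consistency
    # regularizer penalizing disagreement between predictions on x and augment(x).
    logits = model(x)
    logits_aug = model(augment(x))
    consistency = F.mse_loss(logits_aug, logits)
    return F.cross_entropy(logits, y) + lam * consistency

In both cases the loss would be minimized over mini-batches as usual; the weight lam controls how strongly the consistency constraint is enforced.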

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-yang23c,
  title     = {Sample Efficiency of Data Augmentation Consistency Regularization},
  author    = {Yang, Shuo and Dong, Yijun and Ward, Rachel and Dhillon, Inderjit S. and Sanghavi, Sujay and Lei, Qi},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {3825--3853},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/yang23c/yang23c.pdf},
  url       = {https://proceedings.mlr.press/v206/yang23c.html}
}
Endnote
%0 Conference Paper
%T Sample Efficiency of Data Augmentation Consistency Regularization
%A Shuo Yang
%A Yijun Dong
%A Rachel Ward
%A Inderjit S. Dhillon
%A Sujay Sanghavi
%A Qi Lei
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-yang23c
%I PMLR
%P 3825--3853
%U https://proceedings.mlr.press/v206/yang23c.html
%V 206
APA
Yang, S., Dong, Y., Ward, R., Dhillon, I. S., Sanghavi, S., & Lei, Q. (2023). Sample Efficiency of Data Augmentation Consistency Regularization. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:3825-3853. Available from https://proceedings.mlr.press/v206/yang23c.html.
