On the Effects of Artificial Data Modification

Antonia Marcu; Adam Prugel-Bennett

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:15050-15069, 2022.

Abstract

Data distortion is commonly applied in vision models during both training (e.g. methods like MixUp and CutMix) and evaluation (e.g. shape-texture bias and robustness). This data modification can introduce artificial information. It is often assumed that the resulting artefacts are detrimental to training, whilst being negligible when analysing models. We investigate these assumptions and conclude that in some cases they are unfounded and lead to incorrect results. Specifically, we show current shape bias identification methods and occlusion robustness measures are biased and propose a fairer alternative for the latter. Subsequently, through a series of experiments we seek to correct and strengthen the community’s perception of how augmenting affects learning of vision models. Based on our empirical results we argue that the impact of the artefacts must be understood and exploited rather than eliminated.

Cite this Paper

BibTeX

@InProceedings{pmlr-v162-marcu22a,
  title = 	 {On the Effects of Artificial Data Modification},
  author =       {Marcu, Antonia and Prugel-Bennett, Adam},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {15050--15069},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/marcu22a/marcu22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/marcu22a.html},
  abstract = 	 {Data distortion is commonly applied in vision models during both training (e.g. methods like MixUp and CutMix) and evaluation (e.g. shape-texture bias and robustness). This data modification can introduce artificial information. It is often assumed that the resulting artefacts are detrimental to training, whilst being negligible when analysing models. We investigate these assumptions and conclude that in some cases they are unfounded and lead to incorrect results. Specifically, we show current shape bias identification methods and occlusion robustness measures are biased and propose a fairer alternative for the latter. Subsequently, through a series of experiments we seek to correct and strengthen the community’s perception of how augmenting affects learning of vision models. Based on our empirical results we argue that the impact of the artefacts must be understood and exploited rather than eliminated.}
}

Endnote

%0 Conference Paper
%T On the Effects of Artificial Data Modification
%A Antonia Marcu
%A Adam Prugel-Bennett
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-marcu22a
%I PMLR
%P 15050--15069
%U https://proceedings.mlr.press/v162/marcu22a.html
%V 162
%X Data distortion is commonly applied in vision models during both training (e.g. methods like MixUp and CutMix) and evaluation (e.g. shape-texture bias and robustness). This data modification can introduce artificial information. It is often assumed that the resulting artefacts are detrimental to training, whilst being negligible when analysing models. We investigate these assumptions and conclude that in some cases they are unfounded and lead to incorrect results. Specifically, we show current shape bias identification methods and occlusion robustness measures are biased and propose a fairer alternative for the latter. Subsequently, through a series of experiments we seek to correct and strengthen the community’s perception of how augmenting affects learning of vision models. Based on our empirical results we argue that the impact of the artefacts must be understood and exploited rather than eliminated.

APA

Marcu, A. & Prugel-Bennett, A. (2022). On the Effects of Artificial Data Modification. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:15050-15069. Available from https://proceedings.mlr.press/v162/marcu22a.html.
