Provably Strict Generalisation Benefit for Equivariant Models

Bryn Elesedy; Sheheryar Zaidi

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2959-2969, 2021.

Abstract

It is widely believed that engineering a model to be invariant/equivariant improves generalisation. Despite the growing popularity of this approach, a precise characterisation of the generalisation benefit is lacking. By considering the simplest case of linear models, this paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models when the target distribution is invariant/equivariant with respect to a compact group. Moreover, our work reveals an interesting relationship between generalisation, the number of training examples and properties of the group action. Our results rest on an observation of the structure of function spaces under averaging operators which, along with its consequences for feature averaging, may be of independent interest.

Cite this Paper

BibTeX


@InProceedings{pmlr-v139-elesedy21a,
  title     = {Provably Strict Generalisation Benefit for Equivariant Models},
  author    = {Elesedy, Bryn and Zaidi, Sheheryar},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2959--2969},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/elesedy21a/elesedy21a.pdf},
  url       = {https://proceedings.mlr.press/v139/elesedy21a.html},
  abstract = 	 {It is widely believed that engineering a model to be invariant/equivariant improves generalisation. Despite the growing popularity of this approach, a precise characterisation of the generalisation benefit is lacking. By considering the simplest case of linear models, this paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models when the target distribution is invariant/equivariant with respect to a compact group. Moreover, our work reveals an interesting relationship between generalisation, the number of training examples and properties of the group action. Our results rest on an observation of the structure of function spaces under averaging operators which, along with its consequences for feature averaging, may be of independent interest.}
}

Endnote

%0 Conference Paper
%T Provably Strict Generalisation Benefit for Equivariant Models
%A Bryn Elesedy
%A Sheheryar Zaidi
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-elesedy21a
%I PMLR
%P 2959--2969
%U https://proceedings.mlr.press/v139/elesedy21a.html
%V 139
%X It is widely believed that engineering a model to be invariant/equivariant improves generalisation. Despite the growing popularity of this approach, a precise characterisation of the generalisation benefit is lacking. By considering the simplest case of linear models, this paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models when the target distribution is invariant/equivariant with respect to a compact group. Moreover, our work reveals an interesting relationship between generalisation, the number of training examples and properties of the group action. Our results rest on an observation of the structure of function spaces under averaging operators which, along with its consequences for feature averaging, may be of independent interest.

APA


Elesedy, B. & Zaidi, S. (2021). Provably Strict Generalisation Benefit for Equivariant Models. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2959-2969. Available from https://proceedings.mlr.press/v139/elesedy21a.html.
