OCD: Learning to Overfit with Conditional Diffusion Models

Shahar Lutati; Lior Wolf

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:23157-23169, 2023.

Abstract

We present a dynamic model in which the weights are conditioned on an input sample x and are learned to match those that would be obtained by finetuning a base model on x and its label y. This mapping between an input sample and network weights is approximated by a denoising diffusion model. The diffusion model we employ focuses on modifying a single layer of the base model and is conditioned on the input, activations, and output of this layer. Since the diffusion model is stochastic in nature, multiple initializations generate different networks, forming an ensemble, which leads to further improvements. Our experiments demonstrate the wide applicability of the method for image classification, 3D reconstruction, tabular data, speech separation, and natural language processing.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-lutati23a,
  title = 	 {{OCD}: Learning to Overfit with Conditional Diffusion Models},
  author =       {Lutati, Shahar and Wolf, Lior},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {23157--23169},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/lutati23a/lutati23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/lutati23a.html},
  abstract = 	 {We present a dynamic model in which the weights are conditioned on an input sample x and are learned to match those that would be obtained by finetuning a base model on x and its label y. This mapping between an input sample and network weights is approximated by a denoising diffusion model. The diffusion model we employ focuses on modifying a single layer of the base model and is conditioned on the input, activations, and output of this layer. Since the diffusion model is stochastic in nature, multiple initializations generate different networks, forming an ensemble, which leads to further improvements. Our experiments demonstrate the wide applicability of the method for image classification, 3D reconstruction, tabular data, speech separation, and natural language processing.}
}
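
The entry above can be cited from a LaTeX document using its key, pmlr-v202-lutati23a. The following is a minimal sketch, not part of the original page: it assumes the entry has been saved to a file named references.bib (a hypothetical name) and uses the standard plain bibliography style as a placeholder. The pdf, url, and abstract fields are not used by the plain style and are simply ignored.

```latex
\documentclass{article}
\begin{document}

% Cite the paper by the key defined in the BibTeX entry.
Conditioning network weights on each input sample with a diffusion
model was proposed by Lutati and Wolf~\cite{pmlr-v202-lutati23a}.

% Assumed style; substitute the style required by your venue.
\bibliographystyle{plain}
% Assumes the entry is stored in references.bib (hypothetical file name).
\bibliography{references}

\end{document}
```

Compiling with the usual latex, bibtex, latex, latex sequence should resolve the citation.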

Endnote

%0 Conference Paper
%T OCD: Learning to Overfit with Conditional Diffusion Models
%A Shahar Lutati
%A Lior Wolf
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-lutati23a
%I PMLR
%P 23157--23169
%U https://proceedings.mlr.press/v202/lutati23a.html
%V 202
%X We present a dynamic model in which the weights are conditioned on an input sample x and are learned to match those that would be obtained by finetuning a base model on x and its label y. This mapping between an input sample and network weights is approximated by a denoising diffusion model. The diffusion model we employ focuses on modifying a single layer of the base model and is conditioned on the input, activations, and output of this layer. Since the diffusion model is stochastic in nature, multiple initializations generate different networks, forming an ensemble, which leads to further improvements. Our experiments demonstrate the wide applicability of the method for image classification, 3D reconstruction, tabular data, speech separation, and natural language processing.

APA


Lutati, S. & Wolf, L. (2023). OCD: Learning to Overfit with Conditional Diffusion Models. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:23157-23169. Available from https://proceedings.mlr.press/v202/lutati23a.html.
