Large Scale Dataset Distillation with Domain Shift

Noel Loo, Alaa Maalouf, Ramin Hasani, Mathias Lechner, Alexander Amini, Daniela Rus
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:32759-32780, 2024.

Abstract

Dataset Distillation seeks to summarize a large dataset by generating a reduced set of synthetic samples. While there has been much success at distilling small datasets such as CIFAR-10 on smaller neural architectures, Dataset Distillation methods fail to scale to larger high-resolution datasets and architectures. In this work, we introduce Dataset Distillation with Domain Shift (D3S), a scalable distillation algorithm made by reframing the dataset distillation problem as one of domain shift. In doing so, we derive a universal bound on the distillation loss and provide a method for efficiently and approximately optimizing it. We achieve state-of-the-art results on Tiny-ImageNet, ImageNet-1k, and ImageNet-21K over a variety of recently proposed baselines, including high cross-architecture generalization. Additionally, our ablation studies provide lessons on the importance of validation-time hyperparameters for distillation performance, motivating the need for standardization.
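To give a sense of the formulation being reframed, here is a minimal sketch in generic notation: the standard bilevel dataset distillation objective, followed by a classical domain-adaptation-style bound of the kind such a reframing invokes. The symbols (real dataset $\mathcal{T}$, synthetic set $\mathcal{S}$, model $f_\theta$, loss $\ell$, divergence $d$, joint-optimal error $\lambda$) are illustrative, and the bound shown is the generic one from domain adaptation theory, not necessarily the exact D3S bound derived in the paper.

\begin{aligned}
\min_{\mathcal{S}} \;\; & \mathbb{E}_{(x,y)\sim\mathcal{T}}\big[\,\ell\big(f_{\theta(\mathcal{S})}(x),\, y\big)\,\big] \\
\text{s.t.} \;\; & \theta(\mathcal{S}) \in \arg\min_{\theta} \frac{1}{|\mathcal{S}|}\sum_{(\tilde{x},\tilde{y})\in\mathcal{S}} \ell\big(f_{\theta}(\tilde{x}),\, \tilde{y}\big), \\[4pt]
& \varepsilon_{\mathcal{T}}(f) \;\le\; \varepsilon_{\mathcal{S}}(f) \;+\; d(\mathcal{S}, \mathcal{T}) \;+\; \lambda .
\end{aligned}

Intuitively, treating the synthetic set as a source domain and the real data as the target lets the distillation loss be controlled by a source-domain loss plus a distribution-divergence term, which is the kind of quantity that can be optimized tractably at scale.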

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-loo24a,
  title     = {Large Scale Dataset Distillation with Domain Shift},
  author    = {Loo, Noel and Maalouf, Alaa and Hasani, Ramin and Lechner, Mathias and Amini, Alexander and Rus, Daniela},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {32759--32780},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/loo24a/loo24a.pdf},
  url       = {https://proceedings.mlr.press/v235/loo24a.html},
  abstract  = {Dataset Distillation seeks to summarize a large dataset by generating a reduced set of synthetic samples. While there has been much success at distilling small datasets such as CIFAR-10 on smaller neural architectures, Dataset Distillation methods fail to scale to larger high-resolution datasets and architectures. In this work, we introduce Dataset Distillation with Domain Shift (D3S), a scalable distillation algorithm, made by reframing the dataset distillation problem as a domain shift one. In doing so, we derive a universal bound on the distillation loss, and provide a method for efficiently approximately optimizing it. We achieve state-of-the-art results on Tiny-ImageNet, ImageNet-1k, and ImageNet-21K over a variety of recently proposed baselines, including high cross-architecture generalization. Additionally, our ablation studies provide lessons on the importance of validation-time hyperparameters on distillation performance, motivating the need for standardization.}
}
Endnote
%0 Conference Paper
%T Large Scale Dataset Distillation with Domain Shift
%A Noel Loo
%A Alaa Maalouf
%A Ramin Hasani
%A Mathias Lechner
%A Alexander Amini
%A Daniela Rus
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-loo24a
%I PMLR
%P 32759--32780
%U https://proceedings.mlr.press/v235/loo24a.html
%V 235
%X Dataset Distillation seeks to summarize a large dataset by generating a reduced set of synthetic samples. While there has been much success at distilling small datasets such as CIFAR-10 on smaller neural architectures, Dataset Distillation methods fail to scale to larger high-resolution datasets and architectures. In this work, we introduce Dataset Distillation with Domain Shift (D3S), a scalable distillation algorithm, made by reframing the dataset distillation problem as a domain shift one. In doing so, we derive a universal bound on the distillation loss, and provide a method for efficiently approximately optimizing it. We achieve state-of-the-art results on Tiny-ImageNet, ImageNet-1k, and ImageNet-21K over a variety of recently proposed baselines, including high cross-architecture generalization. Additionally, our ablation studies provide lessons on the importance of validation-time hyperparameters on distillation performance, motivating the need for standardization.
APA
Loo, N., Maalouf, A., Hasani, R., Lechner, M., Amini, A. & Rus, D. (2024). Large Scale Dataset Distillation with Domain Shift. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:32759-32780. Available from https://proceedings.mlr.press/v235/loo24a.html.
