MODeL: Memory Optimizations for Deep Learning

Benoit Steiner, Mostafa Elhoushi, Jacob Kahn, James Hegarty
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:32618-32632, 2023.

Abstract

The size of deep neural networks has grown exponentially in recent years. Unfortunately, hardware devices have not kept pace with the rapidly increasing memory requirements. To cope with this, researchers have proposed various techniques including spilling, rematerialization, reduced-precision training, and model pruning. However, these approaches suffer from limitations, such as increasing training time, affecting model accuracy, or requiring extensive manual modifications to the neural networks. We present MODeL, an algorithm that optimizes the lifetime and memory location of the tensors used to train neural networks. Our method automatically reduces the memory usage of existing neural networks without any of the drawbacks of other techniques. We formulate the problem as a joint integer linear program (ILP). We present several techniques to simplify the encoding of the problem and enable our approach to scale to the size of state-of-the-art neural networks using an off-the-shelf ILP solver. We experimentally demonstrate that MODeL takes only seconds to find solutions that enable neural networks to train using 30% less memory on average.
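A back-of-the-envelope way to see what such an ILP looks like is the address-assignment sub-problem: given tensors with known lifetimes and sizes, choose a base offset for each so that temporally overlapping tensors never share bytes, while minimizing the peak footprint. The sketch below is illustrative only and covers just this half of the problem (the paper's formulation is joint, also optimizing the operator order that determines tensor lifetimes); the tensor names, sizes, and lifetimes are made up, and PuLP with its bundled CBC solver merely stands in for an off-the-shelf ILP solver.

    import pulp

    # Each tensor: (name, size in bytes, first use step, last use step).
    # These values are made up for illustration.
    tensors = [
        ("act0",  64, 0, 2),
        ("act1",  96, 1, 3),
        ("grad0", 64, 2, 4),
    ]
    BIG = sum(size for _, size, _, _ in tensors)  # big-M: total bytes is a safe bound

    prob = pulp.LpProblem("tensor_packing", pulp.LpMinimize)
    peak = pulp.LpVariable("peak_memory", lowBound=0)
    offset = {name: pulp.LpVariable(f"off_{name}", lowBound=0)
              for name, _, _, _ in tensors}

    prob += peak  # objective: minimize the peak memory footprint

    # Every tensor must fit below the peak.
    for name, size, _, _ in tensors:
        prob += offset[name] + size <= peak

    # Tensors that are live at the same time must occupy disjoint address ranges.
    for i, (a, sa, a0, a1) in enumerate(tensors):
        for b, sb, b0, b1 in tensors[i + 1:]:
            if a0 <= b1 and b0 <= a1:  # lifetimes intersect
                below = pulp.LpVariable(f"{a}_below_{b}", cat="Binary")
                # Exactly one of the two memory orderings is enforced.
                prob += offset[a] + sa <= offset[b] + BIG * (1 - below)
                prob += offset[b] + sb <= offset[a] + BIG * below

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print("peak bytes:", pulp.value(peak))
    for name, _, _, _ in tensors:
        print(f"{name} -> offset {pulp.value(offset[name])}")

On this toy instance all three tensors are simultaneously live at step 2, so the best achievable peak is the full 224 bytes; with non-overlapping lifetimes the solver would instead reuse addresses and report a smaller peak. Jointly choosing the schedule, as the paper does, is what creates such reuse opportunities in the first place.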

Cite this Paper

BibTeX
@InProceedings{pmlr-v202-steiner23a,
  title     = {{MOD}e{L}: Memory Optimizations for Deep Learning},
  author    = {Steiner, Benoit and Elhoushi, Mostafa and Kahn, Jacob and Hegarty, James},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {32618--32632},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/steiner23a/steiner23a.pdf},
  url       = {https://proceedings.mlr.press/v202/steiner23a.html}
}
Endnote
%0 Conference Paper
%T MODeL: Memory Optimizations for Deep Learning
%A Benoit Steiner
%A Mostafa Elhoushi
%A Jacob Kahn
%A James Hegarty
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-steiner23a
%I PMLR
%P 32618--32632
%U https://proceedings.mlr.press/v202/steiner23a.html
%V 202
APA
Steiner, B., Elhoushi, M., Kahn, J., & Hegarty, J. (2023). MODeL: Memory Optimizations for Deep Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:32618-32632. Available from https://proceedings.mlr.press/v202/steiner23a.html.
