Moccasin: Efficient Tensor Rematerialization for Neural Networks

Burak Bartan, Haoming Li, Harris Teague, Christopher Lott, Bistra Dilkina
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:1826-1837, 2023.

Abstract

The deployment and training of neural networks on edge computing devices pose many challenges. The limited memory of edge devices is often one of the biggest factors restricting the deployment of large neural network models. Tensor rematerialization, or recompute, is a way to address the high memory requirements of neural network training and inference. In this paper we consider the problem of minimizing the execution time of compute graphs subject to a memory budget. In particular, we develop a new constraint programming formulation called Moccasin with only $O(n)$ integer variables, where $n$ is the number of nodes in the compute graph. This is a significant improvement over recent work that proposes formulations with $O(n^2)$ Boolean variables. We present numerical studies showing that our approach is up to an order of magnitude faster than recent work, especially for large-scale graphs.
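The core modeling idea in the abstract (per-node integer variables plus a global memory budget) can be illustrated with a small constraint program. The sketch below is not the paper's Moccasin formulation: it schedules a made-up four-node chain graph with Google OR-Tools CP-SAT, represents each tensor's in-memory lifetime as an interval variable, and enforces the budget with a cumulative constraint, while omitting rematerialization entirely (the paper's formulation additionally lets a tensor have multiple retention intervals, one per recomputation). All durations, tensor sizes, and the budget are hypothetical toy values.

from ortools.sat.python import cp_model

# Toy chain graph: node i+1 consumes node i's output tensor.
durations = [2, 3, 1, 4]  # hypothetical per-node compute times
mem_sizes = [5, 3, 6, 2]  # hypothetical output-tensor sizes
budget = 9                # hypothetical memory budget
horizon = sum(durations)

model = cp_model.CpModel()
starts, ends, intervals = [], [], []
for i, d in enumerate(durations):
    s = model.NewIntVar(0, horizon, f"start_{i}")
    e = model.NewIntVar(0, horizon, f"end_{i}")
    length = model.NewIntVar(0, horizon, f"len_{i}")
    # Retention interval: the output tensor occupies memory from the
    # moment node i starts producing it until it is last needed.
    intervals.append(model.NewIntervalVar(s, length, e, f"retain_{i}"))
    starts.append(s)
    ends.append(e)
    # The tensor exists at least until its own computation finishes.
    model.Add(e >= s + d)

for i in range(1, len(durations)):
    # A node starts only after its input tensor has been produced ...
    model.Add(starts[i] >= starts[i - 1] + durations[i - 1])
    # ... and the input must stay resident until the node finishes.
    model.Add(ends[i - 1] >= starts[i] + durations[i])

# At every point in time, the retained tensors must fit in the budget.
model.AddCumulative(intervals, mem_sizes, budget)

# Minimize the completion time of the final node.
model.Minimize(starts[-1] + durations[-1])

solver = cp_model.CpSolver()
if solver.Solve(model) in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    for i in range(len(durations)):
        print(f"node {i}: starts at t={solver.Value(starts[i])}, "
              f"output retained until t={solver.Value(ends[i])}")

Because the number of interval variables here grows linearly with the number of nodes, this style of model is one plausible reading of the abstract's $O(n)$ integer-variable claim; the actual formulation and recomputation mechanics are in the paper itself.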

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-bartan23a,
  title     = {Moccasin: Efficient Tensor Rematerialization for Neural Networks},
  author    = {Bartan, Burak and Li, Haoming and Teague, Harris and Lott, Christopher and Dilkina, Bistra},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {1826--1837},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/bartan23a/bartan23a.pdf},
  url       = {https://proceedings.mlr.press/v202/bartan23a.html}
}
Endnote
%0 Conference Paper
%T Moccasin: Efficient Tensor Rematerialization for Neural Networks
%A Burak Bartan
%A Haoming Li
%A Harris Teague
%A Christopher Lott
%A Bistra Dilkina
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-bartan23a
%I PMLR
%P 1826--1837
%U https://proceedings.mlr.press/v202/bartan23a.html
%V 202
APA
Bartan, B., Li, H., Teague, H., Lott, C. & Dilkina, B. (2023). Moccasin: Efficient Tensor Rematerialization for Neural Networks. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:1826-1837. Available from https://proceedings.mlr.press/v202/bartan23a.html.
