ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations

Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Michael F P O’Boyle, Hugh Leather
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2244-2253, 2021.

Abstract

Machine learning (ML) is increasingly seen as a viable approach for building compiler optimization heuristics, but many ML methods cannot replicate even the simplest of the data flow analyses that are critical to making good optimization decisions. We posit that if ML cannot do that, then it is insufficiently able to reason about programs. We formulate data flow analyses as supervised learning tasks and introduce a large open dataset of programs and their corresponding labels from several analyses. We use this dataset to benchmark ML methods and show that they struggle on these fundamental program reasoning tasks. We propose ProGraML - Program Graphs for Machine Learning - a language-independent, portable representation of program semantics. ProGraML overcomes the limitations of prior works and yields improved performance on downstream optimization tasks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-cummins21a,
  title     = {ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations},
  author    = {Cummins, Chris and Fisches, Zacharias V. and Ben-Nun, Tal and Hoefler, Torsten and O'Boyle, Michael F P and Leather, Hugh},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2244--2253},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/cummins21a/cummins21a.pdf},
  url       = {https://proceedings.mlr.press/v139/cummins21a.html},
  abstract  = {Machine learning (ML) is increasingly seen as a viable approach for building compiler optimization heuristics, but many ML methods cannot replicate even the simplest of the data flow analyses that are critical to making good optimization decisions. We posit that if ML cannot do that, then it is insufficiently able to reason about programs. We formulate data flow analyses as supervised learning tasks and introduce a large open dataset of programs and their corresponding labels from several analyses. We use this dataset to benchmark ML methods and show that they struggle on these fundamental program reasoning tasks. We propose ProGraML - Program Graphs for Machine Learning - a language-independent, portable representation of program semantics. ProGraML overcomes the limitations of prior works and yields improved performance on downstream optimization tasks.}
}
Endnote
%0 Conference Paper
%T ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations
%A Chris Cummins
%A Zacharias V. Fisches
%A Tal Ben-Nun
%A Torsten Hoefler
%A Michael F P O’Boyle
%A Hugh Leather
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-cummins21a
%I PMLR
%P 2244--2253
%U https://proceedings.mlr.press/v139/cummins21a.html
%V 139
%X Machine learning (ML) is increasingly seen as a viable approach for building compiler optimization heuristics, but many ML methods cannot replicate even the simplest of the data flow analyses that are critical to making good optimization decisions. We posit that if ML cannot do that, then it is insufficiently able to reason about programs. We formulate data flow analyses as supervised learning tasks and introduce a large open dataset of programs and their corresponding labels from several analyses. We use this dataset to benchmark ML methods and show that they struggle on these fundamental program reasoning tasks. We propose ProGraML - Program Graphs for Machine Learning - a language-independent, portable representation of program semantics. ProGraML overcomes the limitations of prior works and yields improved performance on downstream optimization tasks.
APA
Cummins, C., Fisches, Z.V., Ben-Nun, T., Hoefler, T., O’Boyle, M.F.P. & Leather, H. (2021). ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2244-2253. Available from https://proceedings.mlr.press/v139/cummins21a.html.

Related Material