DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization

Abstract

The deep learning (DL) compiler is a fundamental piece of infrastructure that enables the deployment of deep neural networks (DNNs) on various hardware platforms (e.g., mobile devices and Raspberry Pi). A DL compiler translates DNN programs written in high-level DL frameworks (e.g., PyTorch and TensorFlow) into portable executables, which deployed host programs can then run flexibly. Existing DL compilers treat neural network programs as static data-flow graphs, which presumes a pre-determined DNN model architecture. However, this assumption does not hold for modern dynamic neural networks (DyNNs), whose architectures can vary at run time depending on the input. As a result, existing DL compilers cannot compile DyNNs into correct executables. To bridge this gap, we propose DyCL, a flexible approach that enables existing DL compilers to compile DyNNs. DyCL handles the dynamic nature of DyNNs by introducing a compilation mechanism that redistributes the control and data flow of the original programs during compilation. Specifically, DyCL applies program analysis and transformation techniques to split a dynamic neural network into multiple sub-neural networks, each of which contains no conditional statements and is compiled separately. DyCL then synthesizes a host API that models the control flow of the DyNN and invokes the sub-neural networks. Our evaluation demonstrates that DyCL successfully compiles all of the evaluated DyNNs (a 100% success rate), and the compiled executables run 1.12x to 20.21x faster than the original DyNNs running on general-purpose DL frameworks.
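To make the mechanism concrete, below is a minimal sketch of the rewriting idea, not DyCL's actual implementation: a toy PyTorch model whose forward() branches on its input is split into two conditional-free sub-networks, and a hand-written host function reproduces the control flow and dispatches to the compiled pieces. Here torch.jit.trace stands in for a real DL compiler backend, and all names (ToyDyNN, BranchA, BranchB, host_forward) are hypothetical.

import torch
import torch.nn as nn

class ToyDyNN(nn.Module):
    """Original dynamic network: the branch taken depends on the input."""
    def __init__(self):
        super().__init__()
        self.shallow = nn.Linear(8, 8)
        self.deep = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))

    def forward(self, x):
        if x.mean() > 0:            # data-dependent control flow: a static
            return self.shallow(x)  # compiler cannot resolve this branch
        return self.deep(x)

# Step 1: extract each branch body into its own conditional-free sub-network.
# Each sub-network is a static graph an ordinary DL compiler can handle.
class BranchA(nn.Module):
    def __init__(self, shallow):
        super().__init__()
        self.shallow = shallow
    def forward(self, x):
        return self.shallow(x)

class BranchB(nn.Module):
    def __init__(self, deep):
        super().__init__()
        self.deep = deep
    def forward(self, x):
        return self.deep(x)

# Step 2: compile each sub-network separately (tracing stands in for a real
# compiler backend such as TVM or ONNX Runtime).
orig = ToyDyNN()
example = torch.randn(1, 8)
branch_a = torch.jit.trace(BranchA(orig.shallow), example)
branch_b = torch.jit.trace(BranchB(orig.deep), example)

# Step 3: a synthesized host function re-creates the original control flow
# and invokes the compiled sub-networks.
def host_forward(x):
    if x.mean() > 0:
        return branch_a(x)
    return branch_b(x)

# The host function is behaviorally equivalent to the original DyNN.
x = torch.randn(1, 8)
assert torch.allclose(host_forward(x), orig(x))

In this sketch the branch condition only involves host-side control flow, so it stays in plain Python while each compiled sub-network remains a fixed data-flow graph, mirroring the separation the abstract describes.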

Publication
In the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA).