RuntimeOpr instructions

A RuntimeOpr embeds an offline model, built for another hardware vendor's platform, into a MegEngine graph as a single operator.

Warning

Models containing a RuntimeOpr cannot have their weights saved with megengine.save; they can only be saved as a whole model with trace.dump. See Serialization and deserialization for usage.

Three types of RuntimeOpr are currently supported: TensorRT, Atlas, and Cambricon. A model that includes a RuntimeOpr must run inference on the corresponding hardware platform. The examples below use Atlas; the TensorRT and Cambricon interfaces are similar.

The model contains only one RuntimeOpr

import numpy as np
import megengine as mge
from megengine.module.external import AtlasRuntimeSubgraph

# Read the offline Atlas model as raw bytes
with open("AtlasRuntimeOprTest.om", "rb") as f:
    data = f.read()

m = AtlasRuntimeSubgraph(data)
# The input tensor must live on the Atlas device
inp = mge.tensor(np.ones((4, 3, 16, 16)).astype(np.float32), device="atlas0")

y = m(inp)

Note

  1. The hardware vendor's model file must be opened in binary mode and read as a byte stream.

  2. The input tensor of a RuntimeOpr must reside on a device of the matching type. In this example, the device of inp is "atlas0".
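
Both notes can be checked before the model is constructed. The following is a minimal sketch using only the Python standard library. read_model_bytes and device_matches are hypothetical helpers, not part of MegEngine, and the assumption that a device name is a type prefix followed by an index or "x" (as in "atlas0" and "cpux") is inferred from the examples on this page.

```python
import re
from pathlib import Path


def read_model_bytes(path):
    """Return the vendor model file as raw bytes (hypothetical helper)."""
    # Binary read, equivalent to open(path, "rb").read().
    return Path(path).read_bytes()


def device_matches(device, expected_type):
    """Check that a device name such as "atlas0" has the expected type prefix.

    Assumes names of the form <type><index> or <type>x (for example "cpux").
    """
    pattern = rf"{re.escape(expected_type)}(\d+|x)"
    return re.fullmatch(pattern, device) is not None
```

With these helpers, data = read_model_bytes("AtlasRuntimeOprTest.om") replaces the explicit open call, and device_matches("atlas0", "atlas") can guard the input tensor's device before the RuntimeOpr is called.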

RuntimeOpr as part of the model

import numpy as np
import megengine as mge
import megengine.module as M
import megengine.functional as F
from megengine.module.external import AtlasRuntimeSubgraph

class Net(M.Module):
    def __init__(self, data):
        super().__init__()
        self.runtimeopr = AtlasRuntimeSubgraph(data)

    def forward(self, x):
        out = F.relu(x)
        # out = out.astype(np.float16)
        out = F.copy(out, "atlas0")
        out = self.runtimeopr(out)[0]
        out = F.copy(out, "cpux")
        out = F.relu(out)
        return out

# data holds the bytes of the .om file read in the previous example
m = Net(data)
inp = mge.tensor(np.ones(shape=(1, 64, 32, 32)).astype(np.float32), device="cpux")
y = m(inp)

Note

  1. Before and after the RuntimeOpr, you must use F.copy (megengine.functional.copy) to copy the Tensor from the CPU to Atlas and from Atlas back to the CPU; otherwise an error is raised because the CompNode does not match;

  2. If you need to change the data type, do so on the CPU (see the commented-out astype line in the code above);

  3. You can only copy from the CPU to another device or from another device to the CPU. Direct copies between two non-CPU devices, such as GPU to Atlas, are not supported.
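
Rule 3 implies that a transfer between two non-CPU devices takes two copies, with the CPU as the intermediate stop. A minimal sketch of that routing logic in plain Python follows. copy_route is a hypothetical helper, not a MegEngine function, and it assumes that CPU device names start with "cpu"; device names other than "atlas0" and "cpux" are illustrative.

```python
def copy_route(src, dst, cpu="cpux"):
    """Return the sequence of devices a tensor passes through from src to dst.

    Hypothetical helper: devices whose name starts with "cpu" are treated as
    CPU devices; everything else is treated as an accelerator.
    """
    def is_cpu(name):
        return name.startswith("cpu")

    if src == dst:
        return [src]            # already in place, no copy needed
    if is_cpu(src) or is_cpu(dst):
        return [src, dst]       # one endpoint is the CPU: a single copy
    return [src, cpu, dst]      # accelerator to accelerator: go through the CPU
```

Each adjacent pair in the returned list corresponds to one F.copy call in forward. For example, copy_route("gpu0", "atlas0") yields ["gpu0", "cpux", "atlas0"], which means two copies.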

Serialization and deserialization

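The code below traces the model m from the previous example, dumps it to an in-memory file, loads it back for inference, and checks that both results agree: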

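import io
from megengine.jit import trace
import megengine.utils.comp_graph_tools as cgtools

# m, inp and np come from the previous example
def func(inp):
    feature = m(inp)
    return feature

# Trace in symbolic mode and capture constants so the graph can be dumped
traced_func = trace(func, symbolic=True, capture_as_const=True)
y2 = traced_func(inp)

# Serialize the traced graph into an in-memory file
file = io.BytesIO()
traced_func.dump(file)
file.seek(0)

# Deserialize and run inference on the loaded graph
infer_cg = cgtools.GraphInference(file)
y3 = list((infer_cg.run(inp.numpy())).values())[0]
np.testing.assert_almost_equal(y2.numpy(), y3)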
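
The file.seek(0) call matters because writing to an in-memory stream leaves its position at the end, so a reader that starts from the current position would see no data. The sketch below shows this with the standard library only. fake_dump is a hypothetical stand-in for traced_func.dump and does not reflect MegEngine's serialization format.

```python
import io


def fake_dump(stream, payload=b"serialized-graph"):
    """Hypothetical stand-in for traced_func.dump: write bytes to the stream."""
    stream.write(payload)


buf = io.BytesIO()
fake_dump(buf)

at_end = buf.read()      # position is at the end after writing, so this is empty
buf.seek(0)              # rewind to the start before handing the stream to a reader
from_start = buf.read()  # now the full payload is available
```

The same applies when the graph is dumped to a file on disk and read through the same handle; reopening the file for reading avoids the need to rewind.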