Importing ONNX
Internally, Concrete-ML uses ONNX operators as its intermediate representation (IR) for manipulating machine learning models exported from PyTorch, Hummingbird, and skorch.
As ONNX is becoming the standard exchange format for neural networks, this keeps Concrete-ML flexible while making model representation manipulation easy. In addition, ONNX operators map straightforwardly to NumPy operators, which are supported by Concrete-Numpy, giving access to the Concrete stack's FHE conversion capabilities.
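The idea of mapping ONNX operators to NumPy can be sketched as a small dispatch table. The names below (`ONNX_TO_NUMPY`, `run_node`) are illustrative, not Concrete-ML's actual internals:

```python
import numpy as np

# Illustrative sketch: each ONNX operator type is mapped to an equivalent
# NumPy function, so an ONNX graph can be executed node by node in NumPy.
ONNX_TO_NUMPY = {
    # Gemm(A, B, C) computes alpha*A@B + beta*C (with default alpha=beta=1)
    "Gemm": lambda a, b, c: a @ b + c,
    "Relu": lambda x: np.maximum(x, 0),
    "Sigmoid": lambda x: 1 / (1 + np.exp(-x)),
}

def run_node(op_type, *inputs):
    """Dispatch one ONNX-style node to its NumPy implementation."""
    return ONNX_TO_NUMPY[op_type](*inputs)

x = np.array([[1.0, -2.0]])
w = np.array([[0.5], [0.25]])
b = np.array([0.1])
h = run_node("Gemm", x, w, b)  # 1*0.5 + (-2)*0.25 + 0.1 = 0.1
y = run_node("Relu", h)
```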
Torch to NumPy conversion using ONNX
The diagram below gives an overview of the steps involved in converting an ONNX graph to an FHE-compatible format, i.e., a format that can be compiled to FHE through Concrete-Numpy.
All Concrete-ML built-in models follow the same pattern for FHE conversion:
1. The models are trained with sklearn or PyTorch.
2. All models have a PyTorch implementation for inference. This implementation is provided either by a third-party tool such as Hummingbird or implemented directly in Concrete-ML.
3. The PyTorch model is exported to ONNX. For more information on the use of ONNX in Concrete-ML, see here.
4. The Concrete-ML ONNX parser checks that all the operations in the ONNX graph are supported and assigns reference NumPy operations to them. This step produces a `NumpyModule`.
5. Quantization is performed on the `NumpyModule`, producing a `QuantizedModule`. Two steps are performed: calibration and assignment of equivalent `QuantizedOp` objects to each ONNX operation. The `QuantizedModule` class is the quantized counterpart of the `NumpyModule`.
6. Once the `QuantizedModule` is built, Concrete-Numpy is used to trace the `_forward()` function of the `QuantizedModule`.
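The calibration and quantization steps above can be sketched in a toy, NumPy-only fashion. The helper names (`quant_params`, `quantize`, `quantized_forward`) are illustrative, not Concrete-ML's real API; the point is that after calibration, inference runs on integers only, which is what an FHE circuit requires:

```python
import numpy as np

def quant_params(values, n_bits=8):
    """Compute a scale and zero-point so that values fit in n_bits."""
    vmin, vmax = values.min(), values.max()
    scale = (vmax - vmin) / (2**n_bits - 1)
    zero_point = int(np.round(-vmin / scale))
    return scale, zero_point

def quantize(values, scale, zero_point):
    """Map float values to unsigned integers."""
    return np.round(values / scale).astype(np.int64) + zero_point

w = np.array([[0.5], [-0.25]])                     # float weights
calibration_x = np.linspace(-1, 1, 200).reshape(100, 2)

# Calibration: observe the ranges of inputs and weights.
x_scale, x_zp = quant_params(calibration_x)
w_scale, w_zp = quant_params(w)

def quantized_forward(x):
    """Integer-only matmul followed by dequantization of the result."""
    q_x = quantize(x, x_scale, x_zp)
    q_w = quantize(w, w_scale, w_zp)
    acc = (q_x - x_zp) @ (q_w - w_zp)              # pure integer arithmetic
    return acc * (x_scale * w_scale)               # dequantize

x = np.array([[0.4, 0.8]])
y_float = x @ w                                    # float reference
y_quant = quantized_forward(x)                     # close to y_float
```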
Moreover, by passing a user-provided `nn.Module` to step 2 of the above process, Concrete-ML supports custom user models. See the associated FHE-friendly model documentation for instructions about working with such models.
Once an ONNX model is imported, it is converted to a `NumpyModule`, then to a `QuantizedModule` and, finally, to an FHE circuit. However, as the diagram shows, it is perfectly possible to stop at the `NumpyModule` level if you just want to run the PyTorch model as NumPy code without doing quantization.
Note that the `NumpyModule` interpreter currently supports the following ONNX operators.
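The kind of support check the ONNX parser performs can be sketched as follows. The operator set here is a small illustrative subset, and `check_graph` is not Concrete-ML's real API:

```python
# Illustrative subset of ONNX operators with a NumPy equivalent.
SUPPORTED_OPS = {"Abs", "Add", "Gemm", "MatMul", "Relu", "Sigmoid"}

def check_graph(op_types):
    """Raise if any ONNX operator in the graph has no NumPy equivalent."""
    unsupported = sorted(set(op_types) - SUPPORTED_OPS)
    if unsupported:
        raise ValueError(f"Unsupported ONNX operators: {unsupported}")
    return True
```

With this sketch, `check_graph(["Gemm", "Sigmoid"])` succeeds, while a graph containing an unsupported node such as `"LSTM"` is rejected with an explicit error.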
Inspecting the ONNX models
To better understand how Concrete-ML works under the hood, it is possible to access each model in its ONNX format and then either print it or visualize it by importing the associated file into Netron. For example, with `LogisticRegression`: