Hummingbird is a third-party, open-source library that converts machine learning models into tensor computations, and it can export these models to ONNX. The list of supported models can be found in the Hummingbird documentation.
Concrete-ML allows the conversion of an ONNX inference to NumPy inference (note that NumPy is always the entry point to run models in FHE with Concrete-ML).
Hummingbird exposes a `convert` function that can be imported as follows from the `hummingbird.ml` package:
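```python
from hummingbird.ml import convert
```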
This function can be used to convert a machine learning model to ONNX as follows:
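The sketch below trains a toy scikit-learn model purely for illustration; only the `convert` call on the last line comes from the Hummingbird API:

```python
import numpy
from sklearn.linear_model import LogisticRegression
from hummingbird.ml import convert

# Toy data and model, used only to illustrate the conversion
X = numpy.random.rand(100, 4).astype(numpy.float32)
y = (X[:, 0] > 0.5).astype(numpy.int64)
model = LogisticRegression().fit(X, y)

# Convert the trained model to ONNX; test_input lets Hummingbird
# infer the input shape and types
onnx_model = convert(model, backend="onnx", test_input=X).model
```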
In theory, the resulting `onnx_model` could be used directly with Concrete-ML's `get_equivalent_numpy_forward` function (as long as all operators present in the ONNX model are implemented in NumPy) to obtain the NumPy inference.
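As a rough, hypothetical sketch of that in-theory path (the import path and exact signature are assumptions and may vary across Concrete-ML versions):

```python
# Assumed import path; check the API guide for your Concrete-ML version
from concrete.ml.onnx.convert import get_equivalent_numpy_forward

# Build a NumPy-only forward function from the ONNX graph
numpy_forward = get_equivalent_numpy_forward(onnx_model)

# Run the NumPy inference; outputs are returned as NumPy arrays
numpy_predictions = numpy_forward(X)
```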
In practice, some additional steps are needed to clean the ONNX output and make the graph compatible with Concrete-ML, such as applying quantization where needed or deleting/replacing ONNX operators that are not FHE-friendly (such as Softmax and ArgMax).
Concrete-ML uses Skorch to implement multi-layer, fully-connected PyTorch neural networks in a way that is compatible with the scikit-learn API.
This wrapper implements Torch training boilerplate code, lessening the work required of the user. It is possible to add hooks during the training phase, for example once an epoch is finished.
Skorch allows the user to easily create a classifier or regressor around a neural network (NN) implemented in Torch as an `nn.Module`. Concrete-ML uses this mechanism to provide a fully-connected, multi-layer NN with a configurable number of layers and optional pruning (see pruning and the neural network documentation for more information).
Under the hood, Concrete-ML uses a Skorch wrapper around a single PyTorch module, `SparseQuantNeuralNetImpl`. More information can be found in the API guide.
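As an illustration of the pattern (this is plain Skorch, not Concrete-ML's internal wrapper; the module and callback names are hypothetical):

```python
import torch.nn as nn
from skorch import NeuralNetClassifier
from skorch.callbacks import Callback

class EpochLogger(Callback):
    # Hook invoked by Skorch once an epoch is finished
    def on_epoch_end(self, net, **kwargs):
        print(f"Finished epoch {len(net.history)}")

class SimpleMLP(nn.Module):
    # Hypothetical fully-connected module
    def __init__(self, n_features=10, n_classes=2):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 32),
            nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x):
        return self.layers(x)

# The Skorch wrapper exposes a scikit-learn compatible API (fit/predict)
clf = NeuralNetClassifier(
    SimpleMLP,
    criterion=nn.CrossEntropyLoss,
    max_epochs=5,
    lr=0.1,
    callbacks=[EpochLogger()],
)
# clf.fit(X, y)  # X: float32 features, y: int64 labels
```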
Brevitas is a quantization-aware training toolkit built on top of PyTorch. It provides quantization layers that are one-to-one equivalents of PyTorch layers, but also contain operations that perform the quantization during training.

While Brevitas provides many types of quantization, Concrete-ML uses a custom "mixed integer" quantization, which is much simpler than the "integer only" mode of Brevitas. The "mixed integer" network design is defined as follows:
- All weights and activations of convolutional, linear and pooling layers must be quantized (e.g., using the Brevitas layers `QuantConv2d`, `QuantAvgPool2d`, `QuantLinear`).
- PyTorch floating-point versions of univariate functions can be used, e.g. `torch.relu`, `torch.nn.BatchNorm2d`, `torch.max` (encrypted vs. constant), `torch.add`, `torch.exp`. See the PyTorch supported layers page for a full list.
The "mixed integer" mode used in Concrete-ML neural networks is based on the "integer only" Brevitas quantization that makes both weights and activations representable as integers during training. However, through the use of lookup tables in Concrete-ML, floating point univariate PyTorch functions are supported.
For "mixed integer" quantization to work, the first layer of a Brevitas nn.Module
must be a QuantIdentity
layer. However, you can then use functions such as torch.sigmoid
on the result of such a quantizing operation.
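A minimal sketch of such a design is shown below; the module name and the bit-width of 3 are illustrative assumptions, and the official examples should be preferred:

```python
import torch
import brevitas.nn as qnn

N_BITS = 3  # illustrative bit-width; low values keep the network FHE-friendly

class QuantSimpleNet(torch.nn.Module):
    def __init__(self, n_features, n_classes):
        super().__init__()
        # The first layer must quantize the input
        self.quant_input = qnn.QuantIdentity(bit_width=N_BITS, return_quant_tensor=True)
        self.fc1 = qnn.QuantLinear(n_features, 64, bias=True, weight_bit_width=N_BITS)
        # Re-quantize activations before the next quantized layer
        self.quant_act = qnn.QuantIdentity(bit_width=N_BITS, return_quant_tensor=True)
        self.fc2 = qnn.QuantLinear(64, n_classes, bias=True, weight_bit_width=N_BITS)

    def forward(self, x):
        x = self.quant_input(x)
        x = self.fc1(x)
        # A floating-point univariate function is allowed here:
        # Concrete-ML implements it with a table lookup in FHE
        x = torch.sigmoid(x)
        x = self.quant_act(x)
        return self.fc2(x)
```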
For examples of such a "mixed integer" network design, please see the Quantization Aware Training examples, or go to the MNIST use-case example.
You can also refer to the `SparseQuantNeuralNetImpl` class, which is the basis of the built-in `NeuralNetworkClassifier`.