External libraries

Hummingbird

Hummingbird is a third-party, open-source library that converts machine learning models into tensor computations, and it can export these models to ONNX. The list of supported models can be found in the Hummingbird documentation.

Concrete ML allows the conversion of an ONNX inference to NumPy inference (note that NumPy is always the entry point to run models in FHE with Concrete ML).

Hummingbird exposes a convert function that can be imported as follows from the hummingbird.ml package:

# Disable Hummingbird warnings for pytest.
import warnings
warnings.filterwarnings("ignore")
from hummingbird.ml import convert

This function can be used to convert a machine learning model to an ONNX as follows:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Instantiate the logistic regression from sklearn
model = LogisticRegression()

# Create synthetic data
X, y = make_classification(
    n_samples=100, n_features=20, n_classes=2
)

# Fit the model
model.fit(X, y)

# Convert the model to ONNX
onnx_model = convert(model, backend="onnx", test_input=X).model

In theory, the resulting onnx_model could be used directly within Concrete ML's get_equivalent_numpy_forward method (as long as all operators present in the ONNX model are implemented in NumPy) and get the NumPy inference.

In practice, there are some steps needed to clean the ONNX output and make the graph compatible with Concrete ML, such as applying quantization where needed or deleting/replacing non-FHE friendly ONNX operators (such as Softmax and ArgMax).

skorch

Concrete ML uses skorch to implement multi-layer, fully-connected PyTorch neural networks in a way that is compatible with the scikit-learn API.

This wrapper implements Torch training boilerplate code, lessening the work required of the user. It is possible to add hooks during the training phase, for example once an epoch is finished.

skorch allows the user to easily create a classifier or regressor around a neural network (NN), implemented in Torch as a nn.Module, which is used by Concrete ML to provide a fully-connected, multi-layer NN with a configurable number of layers and optional pruning (see pruning and the neural network documentation for more information).

Under the hood, Concrete ML uses a skorch wrapper around a single PyTorch module, SparseQuantNeuralNetwork. More information can be found in the API guide.

class SparseQuantNeuralNetImpl(nn.Module):
    """Sparse Quantized Neural Network classifier.

Brevitas

Brevitas is a quantization aware learning toolkit built on top of PyTorch. It provides quantization layers that are one-to-one equivalents to PyTorch layers, but also contain operations that perform the quantization during training.

While Brevitas provides many types of quantization, for Concrete ML, a custom "mixed integer" quantization applies. This "mixed integer" quantization is much simpler than the "integer only" mode of Brevitas. The "mixed integer" network design is defined as:

  • all weights and activations of convolutional, linear and pooling layers must be quantized (e.g., using Brevitas layers, QuantConv2D, QuantAvgPool2D, QuantLinear)

  • PyTorch floating-point versions of univariate functions can be used (e.g., torch.relu, nn.BatchNormalization2D, torch.max (encrypted vs. constant), torch.add, torch.exp). See the PyTorch supported layers page for a full list.

The "mixed integer" mode used in Concrete ML neural networks is based on the "integer only" Brevitas quantization that makes both weights and activations representable as integers during training. However, through the use of lookup tables in Concrete ML, floating point univariate PyTorch functions are supported.

For "mixed integer" quantization to work, the first layer of a Brevitas nn.Module must be a QuantIdentity layer. However, you can then use functions such as torch.sigmoid on the result of such a quantizing operation.

import torch.nn as nn

class QATnetwork(nn.Module):
    def __init__(self):
        super(QATnetwork, self).__init__()
        self.quant_inp = qnn.QuantIdentity(
            bit_width=4, return_quant_tensor=True)
        # ...

    def forward(self, x):
        out = self.quant_inp(x)
        return torch.sigmoid(out)
        # ...

For examples of such a "mixed integer" network design, please see the Quantization Aware Training examples:

You can also refer to the SparseQuantNeuralNetImpl class, which is the basis of the built-in NeuralNetworkClassifier.

Last updated