Using Torch
In addition to the built-in models, Concrete-ML supports generic machine learning models implemented with Torch, or exported to ONNX.
As Quantization Aware Training (QAT) is the most appropriate method of training neural networks that are compatible with FHE constraints, Concrete-ML works with Brevitas, a library providing QAT support for PyTorch.
The following example uses a simple QAT PyTorch model that implements a fully connected neural network with two hidden layers. Due to its small size, making this model respect FHE constraints is relatively easy.
The model can now be used to perform encrypted inference. Next, the test data is quantized:
and the encrypted inference can be run using either:
quantized_numpy_module.forward_and_dequant() to compute predictions in the clear on quantized data, and then de-quantize the result. The return value of this function contains the dequantized (float) output of running the model in the clear. Calling the forward function on the clear data is useful when debugging. The results in FHE will be the same as those on clear quantized data.
quantized_numpy_module.forward_fhe.encrypt_run_decrypt() to perform the FHE inference. In this case, de-quantization is done in a second stage using quantized_numpy_module.dequantize_output().
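The two-stage flow described above (integer computation first, de-quantization second) can be illustrated with a minimal plain-Python sketch. This is not Concrete-ML's implementation: the helpers `quantize` and `dequantize`, the scales, and the zero points are hypothetical names and values chosen for illustration, and the integer accumulation only stands in for the part that would run on encrypted data. Because that part works on integers only, running it in the clear gives the same integers that FHE execution would.

```python
# Conceptual sketch only; all names and constants here are hypothetical.

def quantize(values, scale, zero_point, n_bits):
    """Map floats to unsigned n_bits integers: q = round(x / scale) + zero_point."""
    q_max = 2**n_bits - 1
    return [min(max(round(v / scale) + zero_point, 0), q_max) for v in values]


def dequantize(q_value, scale, zero_point=0):
    """Map an integer back to a float: x = scale * (q - zero_point)."""
    return scale * (q_value - zero_point)


N_BITS = 3                 # 3-bit values, so integers lie in [0, 7]
SCALE_X, ZP_X = 0.25, 4    # assumed quantization parameters for the inputs
SCALE_W, ZP_W = 0.5, 4     # assumed quantization parameters for the weights

x = [0.5, -0.75, 0.25]     # float inputs
w = [1.0, -0.5, 1.5]       # float weights of a single output neuron

q_x = quantize(x, SCALE_X, ZP_X, N_BITS)
q_w = quantize(w, SCALE_W, ZP_W, N_BITS)

# Stage 1: integer-only dot product (the part that FHE would evaluate).
q_acc = sum((qx - ZP_X) * (qw - ZP_W) for qx, qw in zip(q_x, q_w))

# Stage 2: de-quantization, done in the clear on the integer result.
y = dequantize(q_acc, SCALE_X * SCALE_W)

print(q_x, q_w, q_acc, y)
```

The values here were picked so that they fall exactly on the quantization grid; with arbitrary inputs the de-quantized result would only approximate the float computation.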
While the example above shows how to import a Brevitas/PyTorch model, Concrete-ML also provides an option to import generic QAT models implemented either in PyTorch or through ONNX. Deep learning models made with TensorFlow or Keras should also be usable after first converting them to ONNX.
QAT models contain quantizers in the PyTorch graph. These quantizers ensure that the inputs to the Linear/Dense and Conv layers are quantized.
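As a rough illustration of what such a quantizer does to the values entering a layer, the plain-Python sketch below snaps floats onto a uniform grid. The function `fake_quantize` and its fixed range are assumptions made for this example; Brevitas quantizers are more elaborate (for instance, they learn or track their ranges during training).

```python
# Conceptual sketch only; fake_quantize is a hypothetical helper.

def fake_quantize(x, n_bits, x_min, x_max):
    """Snap x to the nearest of 2**n_bits evenly spaced levels in [x_min, x_max]."""
    n_levels = 2**n_bits
    step = (x_max - x_min) / (n_levels - 1)
    clipped = min(max(x, x_min), x_max)          # saturate out-of-range values
    index = round((clipped - x_min) / step)      # integer level in [0, n_levels - 1]
    return x_min + index * step                  # float value on the grid


N_BITS = 3
inputs = [i / 50 - 1.0 for i in range(101)]      # 101 floats covering [-1, 1]
quantized_inputs = [fake_quantize(v, N_BITS, -1.0, 1.0) for v in inputs]

# However many distinct floats go in, at most 2**N_BITS distinct values come out.
distinct_levels = sorted(set(quantized_inputs))
print(len(distinct_levels), distinct_levels)
```

The layer that follows therefore only ever sees a small, fixed set of input values, which is what allows it to be evaluated on integers.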
When importing QAT models using this generic pipeline, a representative calibration set should be given, because the quantization parameters in the model need to be inferred from the statistics of the values encountered during inference.
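The sketch below shows one simple way quantization parameters can be inferred from a calibration set, and why that set should be representative. It assumes min/max calibration, which is a common scheme but not necessarily the one Concrete-ML applies; `calibrate` and `quantize` are hypothetical helpers.

```python
# Conceptual sketch only; assumes min/max calibration with hypothetical helpers.

def calibrate(calibration_values, n_bits):
    """Derive (scale, zero_point) from the min and max of a representative sample."""
    lo, hi = min(calibration_values), max(calibration_values)
    if hi == lo:
        raise ValueError("calibration set must contain at least two distinct values")
    q_max = 2**n_bits - 1
    scale = (hi - lo) / q_max            # float width of one integer step
    zero_point = round(-lo / scale)      # integer that represents 0.0
    return scale, zero_point


def quantize(x, scale, zero_point, n_bits):
    """Quantize one float, saturating at the ends of the integer range."""
    q = round(x / scale) + zero_point
    return min(max(q, 0), 2**n_bits - 1)


calibration_set = [-1.0, -0.25, 0.0, 0.5, 0.75]
scale, zero_point = calibrate(calibration_set, n_bits=3)

in_range = [quantize(v, scale, zero_point, 3) for v in calibration_set]
out_of_range = quantize(2.0, scale, zero_point, 3)   # never seen during calibration

print(scale, zero_point, in_range, out_of_range)
```

A value outside the calibrated range saturates at the largest integer, so its information is lost. This is the practical reason the calibration data should cover the values expected at inference time.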
Concrete-ML supports a variety of PyTorch operators that can be used to build fully connected or convolutional neural networks, with normalization and activation layers. Moreover, many element-wise operators are supported.
Note that Concrete-ML supports not only these operators but also their QAT equivalents from Brevitas:
brevitas.nn.QuantLinear
brevitas.nn.QuantConv2d
brevitas.nn.QuantIdentity
Once the model is trained, calling the dedicated compilation function from Concrete-ML will automatically perform conversion and compilation of a QAT network. Here, 3-bit quantization is used for both the weights and activations.
Suppose that n_bits_qat is the bit-width of activations and weights during the QAT process. To import a PyTorch QAT network, you can use the library function, passing import_qat=True:
Alternatively, if you want to import an ONNX model directly, please see the Concrete-ML API documentation. The ONNX import function also supports the import_qat parameter.