Concrete ML 1.7

Using ONNX

This document explains how to compile ONNX models in Concrete ML. This is particularly useful for importing models trained with Keras.

You can compile ONNX models either by directly importing models that are already quantized with Quantization Aware Training (QAT) or by performing Post Training Quantization (PTQ) with Concrete ML.

Simple example

The following example shows how to compile an ONNX model using PTQ. The model was initially trained using Keras before being exported to ONNX. The training code is not shown here.

This example uses PTQ, meaning that quantization is applied after training rather than during it. As a result, this model does not achieve optimal performance in FHE.
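To make the idea of PTQ more concrete, here is a minimal, dependency-free sketch of uniform quantization applied after training: a scale and zero point are derived from a calibration set, then used to map floats to `n_bits` integers and back. The helper names (`calibrate`, `quantize`, `dequantize`) are hypothetical, and this is not Concrete ML's internal implementation. It only illustrates why a low bit-width loses precision.

```python
import random


def calibrate(values, n_bits):
    """Derive a scale and an integer zero point from calibration data."""
    lo, hi = min(values), max(values)
    levels = 2**n_bits - 1  # largest integer representable on n_bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, n_bits):
    """Map a float to an integer clipped to [0, 2**n_bits - 1]."""
    q = round(x / scale) + zero_point
    return max(0, min(2**n_bits - 1, q))


def dequantize(q, scale, zero_point):
    """Map the integer back to an approximate float."""
    return (q - zero_point) * scale


random.seed(0)
# Calibration data drawn from the same range as the example's input set
calibration = [random.uniform(-100, 100) for _ in range(5000)]

errors = {}
for n_bits in (2, 6):
    scale, zero_point = calibrate(calibration, n_bits)
    errors[n_bits] = max(
        abs(x - dequantize(quantize(x, scale, zero_point, n_bits), scale, zero_point))
        for x in calibration
    )
    print(f"n_bits={n_bits}: max round-trip error = {errors[n_bits]:.3f}")
```

The round-trip error shrinks as `n_bits` grows, which is the precision side of the trade-off against FHE execution cost.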

To improve performance in FHE, you should use QAT. You can also import QAT ONNX models, as shown below.

import numpy
import onnx

from concrete.ml.torch.compile import compile_onnx_model

# Number of bits used to quantize weights and activations (PTQ)
n_bits = 2
input_output_feature = 5
input_shape = (input_output_feature,)
num_inputs = 1
n_examples = 5000

# Create a random calibration set
input_set = numpy.random.uniform(-100, 100, size=(n_examples, *input_shape))

# Load the ONNX model exported from Keras and check that it is well-formed
onnx_model = onnx.load(f"tests/data/tf_onnx/fc_{input_output_feature}.onnx")
onnx.checker.check_model(onnx_model)

# Compile the model to an FHE circuit, quantizing it with PTQ
quantized_module = compile_onnx_model(
    onnx_model, input_set, n_bits=n_bits
)

# Create test data from the same distribution and quantize using
# learned quantization parameters during compilation
x_test = tuple(numpy.random.uniform(-100, 100, size=(1, *input_shape)) for _ in range(num_inputs))

y_clear = quantized_module.forward(*x_test, fhe="disable")
y_fhe = quantized_module.forward(*x_test, fhe="execute")

print("Execution in clear: ", y_clear)
print("Execution in FHE:   ", y_fhe)
print("Equality:           ", numpy.sum(y_clear == y_fhe), "over", numpy.size(y_fhe), "values")
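If the two executions do not agree on every value, it helps to know how far apart they are. The sketch below reports the number of exact matches together with the largest absolute difference between two flat sequences of outputs. `summarize_agreement` is a hypothetical helper, not part of Concrete ML, and the values passed to it here are made up for illustration.

```python
def summarize_agreement(clear_values, fhe_values):
    """Return (exact matches, total values, largest absolute difference)."""
    clear_values = list(clear_values)
    fhe_values = list(fhe_values)
    if len(clear_values) != len(fhe_values):
        raise ValueError("both sequences must have the same length")
    pairs = list(zip(clear_values, fhe_values))
    matches = sum(c == f for c, f in pairs)
    max_diff = max((abs(c - f) for c, f in pairs), default=0.0)
    return matches, len(clear_values), max_diff


# Made-up values standing in for flattened model outputs
matches, total, max_diff = summarize_agreement([0.5, -1.25, 3.0], [0.5, -1.0, 3.0])
print(f"{matches} of {total} values match, max abs difference = {max_diff}")
```

With the arrays from the example, one option is to pass `y_clear.ravel().tolist()` and `y_fhe.ravel().tolist()`.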

While a Keras ONNX model was used in this example, Keras/Tensorflow support in Concrete ML is only partial and experimental.

Quantization Aware Training

Models trained using QAT contain quantizers in the ONNX graph. These quantizers ensure that the inputs to the Linear/Dense and Conv layers are quantized. Since these QAT models have quantizers configured to a specific number of bits during training, you must import the ONNX graph using the same settings:

# Define the number of bits to use for quantizing weights and activations during training
n_bits_qat = 3  

quantized_numpy_module = compile_onnx_model(
    onnx_model,
    input_set,
    import_qat=True,
    n_bits=n_bits_qat,
)
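To see why the bit-width must match, it can help to picture what a quantizer in a QAT graph does: it snaps each value onto a small grid whose size is fixed by the number of bits. The following is a rough sketch of such a "fake quantization" step in plain Python, assuming a symmetric signed grid and a fixed scale. `fake_quantize` is a hypothetical name, and real QAT libraries differ in the details.

```python
def fake_quantize(x, scale, n_bits):
    """Round x to the nearest point of a signed n_bits grid and return a float."""
    q_min = -(2 ** (n_bits - 1))   # -4 for 3 bits
    q_max = 2 ** (n_bits - 1) - 1  # 3 for 3 bits
    q = max(q_min, min(q_max, round(x / scale)))
    return q * scale


n_bits_qat = 3
scale = 0.5  # assumed fixed here; in QAT it is learned or calibrated

# Sweep inputs from -3.0 to 3.0 and collect the distinct quantized values
inputs = [i / 10 for i in range(-30, 31)]
levels = sorted({fake_quantize(x, scale, n_bits_qat) for x in inputs})
print(levels)
```

A 3-bit quantizer yields at most 2**3 = 8 distinct values, so a graph trained with that grid would be misread if imported with a different `n_bits`.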

Supported operators

Concrete ML supports the following operators for evaluation and conversion to an equivalent FHE circuit. Other operators were not implemented either due to FHE constraints or because they are rarely used in PyTorch activations or scikit-learn models.

  • Abs

  • Acos

  • Acosh

  • Add

  • Asin

  • Asinh

  • Atan

  • Atanh

  • AveragePool

  • BatchNormalization

  • Cast

  • Celu

  • Clip

  • Concat

  • Constant

  • ConstantOfShape

  • Conv

  • Cos

  • Cosh

  • Div

  • Elu

  • Equal

  • Erf

  • Exp

  • Expand

  • Flatten

  • Floor

  • Gather

  • Gemm

  • Greater

  • GreaterOrEqual

  • HardSigmoid

  • HardSwish

  • Identity

  • LeakyRelu

  • Less

  • LessOrEqual

  • Log

  • MatMul

  • Max

  • MaxPool

  • Min

  • Mul

  • Neg

  • Not

  • OneHot

  • Or

  • PRelu

  • Pad

  • Pow

  • ReduceSum

  • Relu

  • Reshape

  • Round

  • Selu

  • Shape

  • Sigmoid

  • Sign

  • Sin

  • Sinh

  • Slice

  • Softplus

  • Squeeze

  • Sub

  • Tan

  • Tanh

  • ThresholdedRelu

  • Transpose

  • Unfold

  • Unsqueeze

  • Where

  • onnx.brevitas.Quant
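Before compiling, it can be useful to check which operators in a graph fall outside the list above. The sketch below does this for a plain list of operator names. `SUPPORTED_OPS` is copied from the list above, and `unsupported_ops` is a hypothetical helper, not a Concrete ML function.

```python
# Operator names copied from the list of supported operators
SUPPORTED_OPS = frozenset("""
Abs Acos Acosh Add Asin Asinh Atan Atanh AveragePool BatchNormalization
Cast Celu Clip Concat Constant ConstantOfShape Conv Cos Cosh Div Elu Equal
Erf Exp Expand Flatten Floor Gather Gemm Greater GreaterOrEqual HardSigmoid
HardSwish Identity LeakyRelu Less LessOrEqual Log MatMul Max MaxPool Min Mul
Neg Not OneHot Or PRelu Pad Pow ReduceSum Relu Reshape Round Selu Shape
Sigmoid Sign Sin Sinh Slice Softplus Squeeze Sub Tan Tanh ThresholdedRelu
Transpose Unfold Unsqueeze Where onnx.brevitas.Quant
""".split())


def unsupported_ops(op_types):
    """Return the sorted operator names that are not in SUPPORTED_OPS."""
    return sorted(set(op_types) - SUPPORTED_OPS)


# With a loaded model, the names could be collected with:
#   op_types = [node.op_type for node in onnx_model.graph.node]
# Assumption: custom-domain nodes such as Brevitas quantizers may report a
# bare op_type like "Quant" and might need mapping before this check.
example_ops = ["Gemm", "Relu", "Softmax", "Gemm", "LSTM"]
print(unsupported_ops(example_ops))
```

An empty result suggests every operator in the graph appears in the list, though compilation can still fail for other reasons such as bit-width limits.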
