module concrete.ml.torch.compile

Torch compilation functions.

Global Variables

  • MAX_BITWIDTH_BACKWARD_COMPATIBLE

  • OPSET_VERSION_FOR_ONNX_EXPORT
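For illustration, a minimal sketch that imports the two constants listed above; since they are module-level globals of concrete.ml.torch.compile they can be read directly, though their exact values depend on the installed Concrete ML version.

```python
# Minimal sketch: reading the module-level constants listed above.
# Their exact values depend on the installed Concrete ML version.
from concrete.ml.torch.compile import (
    MAX_BITWIDTH_BACKWARD_COMPATIBLE,
    OPSET_VERSION_FOR_ONNX_EXPORT,
)

print(MAX_BITWIDTH_BACKWARD_COMPATIBLE, OPSET_VERSION_FOR_ONNX_EXPORT)
```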


function convert_torch_tensor_or_numpy_array_to_numpy_array

convert_torch_tensor_or_numpy_array_to_numpy_array(
    torch_tensor_or_numpy_array: Union[Tensor, ndarray]
) → ndarray

Convert a torch tensor or a numpy array to a numpy array.

Args:

  • torch_tensor_or_numpy_array (Union[Tensor, numpy.ndarray]): the value that is either a torch tensor or a numpy array.

Returns:

  • numpy.ndarray: the value converted to a numpy array.
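A minimal usage sketch, based only on the signature above: both torch tensors and numpy arrays are accepted and a numpy array is returned.

```python
# Minimal sketch: both input kinds are normalized to numpy arrays.
import numpy
import torch

from concrete.ml.torch.compile import convert_torch_tensor_or_numpy_array_to_numpy_array

from_torch = convert_torch_tensor_or_numpy_array_to_numpy_array(torch.tensor([1.0, 2.0]))
from_numpy = convert_torch_tensor_or_numpy_array_to_numpy_array(numpy.array([1.0, 2.0]))

assert isinstance(from_torch, numpy.ndarray)
assert isinstance(from_numpy, numpy.ndarray)
```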


function build_quantized_module

build_quantized_module(
    model: Union[Module, ModelProto],
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ...]],
    import_qat: bool = False,
    n_bits=8,
    rounding_threshold_bits: Optional[int] = None
) → QuantizedModule

Build a quantized module from a Torch or ONNX model.

Take a model in torch or ONNX, turn it to numpy, quantize its inputs / weights / outputs and retrieve the associated quantized module.

Args:

  • model (Union[torch.nn.Module, onnx.ModelProto]): The model to quantize, either in torch or in ONNX.

  • torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray

  • import_qat (bool): Flag to signal that the network being imported contains quantizers in its computation graph and that Concrete ML should not re-quantize it

  • n_bits: the number of bits for the quantization

  • rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision

Returns:

  • QuantizedModule: The resulting QuantizedModule.
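A hypothetical usage sketch, based only on the signature above; the toy torch model and the randomly generated calibration input-set are illustrative assumptions, not part of this API.

```python
# Hypothetical sketch: quantize a small torch model without compiling it to FHE.
import numpy
import torch

from concrete.ml.torch.compile import build_quantized_module

# Toy fully-connected network (illustrative only).
model = torch.nn.Sequential(
    torch.nn.Linear(10, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 2),
)

# Calibration input-set: representative clear-text samples used to choose quantization parameters.
calibration_set = numpy.random.uniform(-1, 1, size=(100, 10)).astype(numpy.float32)

quantized_module = build_quantized_module(model, calibration_set, n_bits=8)
```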


function compile_torch_model

compile_torch_model(
    torch_model: Module,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ...]],
    import_qat: bool = False,
    configuration: Optional[Configuration] = None,
    artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    n_bits=8,
    rounding_threshold_bits: Optional[int] = None,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose: bool = False
) → QuantizedModule

Compile a torch module into an FHE equivalent.

Take a model in torch, turn it to numpy, quantize its inputs / weights / outputs and finally compile it with Concrete.

Args:

  • torch_model (torch.nn.Module): the model to quantize

  • torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray.

  • import_qat (bool): Set to True to import a network that contains quantizers and was trained using quantization-aware training

  • configuration (Configuration): Configuration object to use during compilation

  • artifacts (DebugArtifacts): Artifacts object to fill during compilation

  • show_mlir (bool): if set, the MLIR produced by the converter and sent to the compiler backend is printed on the screen, e.g., for debugging or demo purposes

  • n_bits: the number of bits for the quantization

  • rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision

  • p_error (Optional[float]): probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. In FHE simulation global_p_error is set to 0

  • verbose (bool): whether to show compilation information

Returns:

  • QuantizedModule: The resulting compiled QuantizedModule.
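A hypothetical usage sketch, assuming a toy model and random calibration data; the n_bits and rounding_threshold_bits values are illustrative choices to keep accumulator bit-widths small, not recommended settings.

```python
# Hypothetical sketch: compile a small torch model into an FHE-equivalent QuantizedModule.
import numpy
import torch

from concrete.ml.torch.compile import compile_torch_model

model = torch.nn.Sequential(
    torch.nn.Linear(10, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 2),
)

calibration_set = numpy.random.uniform(-1, 1, size=(100, 10)).astype(numpy.float32)

# Illustrative quantization settings; real models may need different values.
quantized_module = compile_torch_model(
    model,
    calibration_set,
    n_bits=4,
    rounding_threshold_bits=6,
)
```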


function compile_onnx_model

compile_onnx_model(
    onnx_model: ModelProto,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ...]],
    import_qat: bool = False,
    configuration: Optional[Configuration] = None,
    artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    n_bits=8,
    rounding_threshold_bits: Optional[int] = None,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose: bool = False
) → QuantizedModule

Compile an ONNX model into an FHE equivalent.

Take a model in ONNX, turn it to numpy, quantize its inputs / weights / outputs and finally compile it with Concrete-Python.

Args:

  • onnx_model (onnx.ModelProto): the model to quantize

  • torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray.

  • import_qat (bool): Flag to signal that the network being imported contains quantizers in its computation graph and that Concrete ML should not re-quantize it.

  • configuration (Configuration): Configuration object to use during compilation

  • artifacts (DebugArtifacts): Artifacts object to fill during compilation

  • show_mlir (bool): if set, the MLIR produced by the converter and sent to the compiler backend is printed on the screen, e.g., for debugging or demo purposes

  • n_bits: the number of bits for the quantization

  • rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision

  • p_error (Optional[float]): probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. In FHE simulation global_p_error is set to 0

  • verbose (bool): whether to show compilation information

Returns:

  • QuantizedModule: The resulting compiled QuantizedModule.
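A hypothetical usage sketch: a toy torch model is first exported to ONNX (using the module's OPSET_VERSION_FOR_ONNX_EXPORT constant for the opset), then the ONNX graph is compiled. The export step, model architecture, and file name are assumptions for illustration.

```python
# Hypothetical sketch: export a toy torch model to ONNX, then compile the ONNX graph.
import numpy
import onnx
import torch

from concrete.ml.torch.compile import OPSET_VERSION_FOR_ONNX_EXPORT, compile_onnx_model

model = torch.nn.Sequential(
    torch.nn.Linear(10, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 2),
)
calibration_set = numpy.random.uniform(-1, 1, size=(100, 10)).astype(numpy.float32)

# Export with a dummy input of the right shape (illustrative file name).
dummy_input = torch.from_numpy(calibration_set[:1])
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=OPSET_VERSION_FOR_ONNX_EXPORT)

onnx_model = onnx.load("model.onnx")
quantized_module = compile_onnx_model(onnx_model, calibration_set, n_bits=4)
```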


function compile_brevitas_qat_model

compile_brevitas_qat_model(
    torch_model: Module,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ...]],
    n_bits: Optional[Union[int, dict]] = None,
    configuration: Optional[Configuration] = None,
    artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    rounding_threshold_bits: Optional[int] = None,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    output_onnx_file: Union[Path, str] = None,
    verbose: bool = False
) → QuantizedModule

Compile a Brevitas Quantization Aware Training model.

The torch_model parameter is a subclass of torch.nn.Module that uses quantized operations from brevitas.nn. The model must be trained before calling this function, which compiles the trained model to FHE.

Args:

  • torch_model (torch.nn.Module): the model to quantize

  • torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray.

  • n_bits (Optional[Union[int, dict]]): the number of bits for the quantization. By default, for most models, a value of None should be given, which instructs Concrete ML to use the bit-widths configured through the Brevitas quantization options. For networks that perform a non-linear operation on an input or an output, if None is given, a default value of 8 bits is used for the input/output quantization. For such models the user can also specify a dictionary with model_inputs/model_outputs keys to override the 8-bit default, or a single integer to use for both.

  • configuration (Configuration): Configuration object to use during compilation

  • artifacts (DebugArtifacts): Artifacts object to fill during compilation

  • show_mlir (bool): if set, the MLIR produced by the converter and sent to the compiler backend is printed on the screen, e.g., for debugging or demo purposes

  • rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision

  • p_error (Optional[float]): probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. In FHE simulation global_p_error is set to 0

  • output_onnx_file (Union[Path, str]): temporary file to store the ONNX model. If None, a temporary file is generated

  • verbose (bool): whether to show compilation information

Returns:

  • QuantizedModule: The resulting compiled QuantizedModule.
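A hypothetical usage sketch. The Brevitas layers used below (qnn.QuantIdentity, qnn.QuantLinear and their bit-width arguments) come from the Brevitas API, not from this page, and the model architecture and calibration data are illustrative; in practice the model would be trained with quantization-aware training before compilation.

```python
# Hypothetical sketch: compile a small Brevitas QAT model.
import brevitas.nn as qnn  # Brevitas quantized layers (assumption: Brevitas is installed)
import numpy
import torch

from concrete.ml.torch.compile import compile_brevitas_qat_model


class TinyQATModel(torch.nn.Module):
    """Toy QAT network with 3-bit quantizers (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.quant_inp = qnn.QuantIdentity(bit_width=3, return_quant_tensor=True)
        self.fc1 = qnn.QuantLinear(10, 16, bias=True, weight_bit_width=3)
        self.quant_hidden = qnn.QuantIdentity(bit_width=3, return_quant_tensor=True)
        self.fc2 = qnn.QuantLinear(16, 2, bias=True, weight_bit_width=3)

    def forward(self, x):
        x = self.quant_inp(x)
        x = self.quant_hidden(torch.relu(self.fc1(x)))
        return self.fc2(x)


model = TinyQATModel()
# ... train the model with quantization-aware training here ...

calibration_set = numpy.random.uniform(-1, 1, size=(100, 10)).astype(numpy.float32)

# n_bits is left as None so the bit-widths configured in Brevitas are used.
quantized_module = compile_brevitas_qat_model(model, calibration_set)
```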