concrete.ml.torch.compile.md

module `concrete.ml.torch.compile`

torch compilation function.

Global Variables

MAX_BITWIDTH_BACKWARD_COMPATIBLE
OPSET_VERSION_FOR_ONNX_EXPORT

function `has_any_qnn_layers`

has_any_qnn_layers(torch_model: Module) → bool

Check if a torch model has QNN layers.

This is useful to check if a model is a QAT model.

Args:

torch_model (torch.nn.Module): a torch model

Returns:

bool: whether this torch model contains any QNN layer.

function `convert_torch_tensor_or_numpy_array_to_numpy_array`

convert_torch_tensor_or_numpy_array_to_numpy_array(
    torch_tensor_or_numpy_array: Union[Tensor, ndarray]
) → ndarray

Convert a torch tensor or a numpy array to a numpy array.

Args:

torch_tensor_or_numpy_array (Union[Tensor, ndarray]): the value that is either a torch tensor or a numpy array.

Returns:

numpy.ndarray: the value converted to a numpy array.

function `build_quantized_module`

build_quantized_module(
    model: Union[Module, ModelProto],
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ]],
    import_qat: bool = False,
    n_bits: Union[int, Dict[str, int]] = 8,
    rounding_threshold_bits: Optional[int] = None,
    reduce_sum_copy=False
) → QuantizedModule

Build a quantized module from a Torch or ONNX model.

Take a model in torch or ONNX, convert it to NumPy, quantize its inputs, weights, and outputs, and retrieve the associated quantized module.

Args:

model (Union[torch.nn.Module, onnx.ModelProto]): The model to quantize, either in torch or in ONNX.
torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray
import_qat (bool): Flag to signal that the network being imported contains quantizers in its computation graph and that Concrete ML should not re-quantize it
n_bits (Union[int, Dict[str, int]]): the number of bits for the quantization
rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision
reduce_sum_copy (bool): if the inputs of QuantizedReduceSum should be copied to avoid bit-width propagation

Returns:

QuantizedModule: The resulting QuantizedModule.

function `compile_torch_model`

compile_torch_model(
    torch_model: Module,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ]],
    import_qat: bool = False,
    configuration: Optional[Configuration] = None,
    artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    n_bits: Union[int, Dict[str, int]] = 8,
    rounding_threshold_bits: Optional[int] = None,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose: bool = False,
    inputs_encryption_status: Optional[Sequence[str]] = None,
    reduce_sum_copy: bool = False
) → QuantizedModule

Compile a torch module into an FHE equivalent.

Take a model in torch, convert it to NumPy, quantize its inputs, weights, and outputs, and finally compile it with Concrete.

Args:

torch_model (torch.nn.Module): the model to quantize
torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray.
import_qat (bool): Set to True to import a network that contains quantizers and was trained using quantization aware training
configuration (Configuration): Configuration object to use during compilation
artifacts (DebugArtifacts): Artifacts object to fill during compilation
show_mlir (bool): if set, the MLIR produced by the converter and which is going to be sent to the compiler backend is shown on the screen, e.g., for debugging or demo
n_bits (Union[int, Dict[str, int]]): number of bits for quantization. Can be a single value or a dictionary with the keys "op_inputs" and "op_weights" (mandatory) and "model_inputs" and "model_outputs" (optional, defaulting to 5 bits). When a single integer is given, its value is assigned to "op_inputs" and "op_weights". Default is 8 bits.
rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision
p_error (Optional[float]): probability of error of a single PBS
global_p_error (Optional[float]): probability of error of the full circuit. In FHE simulation global_p_error is set to 0
verbose (bool): whether to show compilation information
inputs_encryption_status (Optional[Sequence[str]]): encryption status ('clear', 'encrypted') for each input. By default all arguments will be encrypted.
reduce_sum_copy (bool): if the inputs of QuantizedReduceSum should be copied to avoid bit-width propagation

Returns:

QuantizedModule: The resulting compiled QuantizedModule.
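A minimal usage sketch of `compile_torch_model`, assuming `numpy`, `torch`, and `concrete-ml` are installed. The network `TinyNet`, the random calibration data, and the helper name `compile_tiny_model` are hypothetical and exist only for illustration; only arguments documented above are passed.

```python
def compile_tiny_model():
    """Compile a small fully connected network, or return None if dependencies are missing."""
    try:
        import numpy
        import torch
        from concrete.ml.torch.compile import compile_torch_model
    except ImportError:
        # The sketch needs numpy, torch and concrete-ml; do nothing without them.
        return None

    class TinyNet(torch.nn.Module):
        """Hypothetical two-layer network used only for this example."""

        def __init__(self):
            super().__init__()
            self.fc1 = torch.nn.Linear(4, 8)
            self.fc2 = torch.nn.Linear(8, 2)

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x)))

    # Calibration input-set: 100 random samples with 4 features each (illustrative data).
    inputset = numpy.random.uniform(-1, 1, size=(100, 4)).astype(numpy.float32)

    # A single integer for n_bits applies to "op_inputs" and "op_weights".
    return compile_torch_model(TinyNet(), inputset, n_bits=4)
```

Calling `compile_tiny_model()` returns the compiled QuantizedModule when the dependencies are available. A low `n_bits` value is used here on the assumption that it keeps accumulator bit-widths small for such a tiny network.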

function `compile_onnx_model`

compile_onnx_model(
    onnx_model: ModelProto,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ]],
    import_qat: bool = False,
    configuration: Optional[Configuration] = None,
    artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    n_bits: Union[int, Dict[str, int]] = 8,
    rounding_threshold_bits: Optional[int] = None,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose: bool = False,
    inputs_encryption_status: Optional[Sequence[str]] = None,
    reduce_sum_copy: bool = False
) → QuantizedModule

Compile an ONNX model into an FHE equivalent.

Take a model in ONNX, convert it to NumPy, quantize its inputs, weights, and outputs, and finally compile it with Concrete-Python.

Args:

onnx_model (onnx.ModelProto): the model to quantize
torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray.
import_qat (bool): Flag to signal that the network being imported contains quantizers in its computation graph and that Concrete ML should not re-quantize it.
configuration (Configuration): Configuration object to use during compilation
artifacts (DebugArtifacts): Artifacts object to fill during compilation
show_mlir (bool): if set, the MLIR produced by the converter and which is going to be sent to the compiler backend is shown on the screen, e.g., for debugging or demo
n_bits (Union[int, Dict[str, int]]): number of bits for quantization. Can be a single value or a dictionary with the keys "op_inputs" and "op_weights" (mandatory) and "model_inputs" and "model_outputs" (optional, defaulting to 5 bits). When a single integer is given, its value is assigned to "op_inputs" and "op_weights". Default is 8 bits.
rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision
p_error (Optional[float]): probability of error of a single PBS
global_p_error (Optional[float]): probability of error of the full circuit. In FHE simulation global_p_error is set to 0
verbose (bool): whether to show compilation information
inputs_encryption_status (Optional[Sequence[str]]): encryption status ('clear', 'encrypted') for each input. By default all arguments will be encrypted.
reduce_sum_copy (bool): if the inputs of QuantizedReduceSum should be copied to avoid bit-width propagation

Returns:

QuantizedModule: The resulting compiled QuantizedModule.
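The two accepted forms of `n_bits` can be pictured as one normalization step. The sketch below follows the `n_bits` description above literally; `normalize_n_bits` and the constant names are hypothetical, and the library's own handling may differ in details.

```python
from typing import Dict, Union

# Default for "model_inputs" / "model_outputs", taken from the n_bits description.
DEFAULT_MODEL_BITS = 5
REQUIRED_KEYS = {"op_inputs", "op_weights"}
OPTIONAL_KEYS = {"model_inputs", "model_outputs"}


def normalize_n_bits(n_bits: Union[int, Dict[str, int]] = 8) -> Dict[str, int]:
    """Expand an int or a partial dict into a dict holding all four keys (hypothetical helper)."""
    if isinstance(n_bits, int):
        # A single integer applies to the operator inputs and weights.
        n_bits = {"op_inputs": n_bits, "op_weights": n_bits}

    missing = REQUIRED_KEYS - n_bits.keys()
    if missing:
        raise ValueError(f"n_bits is missing mandatory keys: {sorted(missing)}")

    unknown = n_bits.keys() - REQUIRED_KEYS - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"n_bits has unknown keys: {sorted(unknown)}")

    # Optional keys fall back to the documented default.
    return {
        "model_inputs": n_bits.get("model_inputs", DEFAULT_MODEL_BITS),
        "model_outputs": n_bits.get("model_outputs", DEFAULT_MODEL_BITS),
        "op_inputs": n_bits["op_inputs"],
        "op_weights": n_bits["op_weights"],
    }
```

Under these assumptions, `normalize_n_bits(6)` yields 6 bits for "op_inputs" and "op_weights" and 5 bits for "model_inputs" and "model_outputs".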

function `compile_brevitas_qat_model`

compile_brevitas_qat_model(
    torch_model: Module,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ]],
    n_bits: Optional[Union[int, Dict[str, int]]] = None,
    configuration: Optional[Configuration] = None,
    artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    rounding_threshold_bits: Optional[int] = None,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    output_onnx_file: Union[NoneType, Path, str] = None,
    verbose: bool = False,
    inputs_encryption_status: Optional[Sequence[str]] = None,
    reduce_sum_copy: bool = False
) → QuantizedModule

Compile a Brevitas Quantization Aware Training model.

The torch_model parameter is a subclass of torch.nn.Module that uses quantized operations from brevitas.qnn. The model is trained before calling this function. This function compiles the trained model to FHE.

Args:

torch_model (torch.nn.Module): the model to quantize
torch_inputset (Dataset): the calibration input-set, can contain either torch tensors or numpy.ndarray.
n_bits (Optional[Union[int, dict]]): the number of bits for the quantization. By default, for most models, a value of None should be given, which instructs Concrete ML to use the bit-widths configured using Brevitas quantization options. For some networks that perform a non-linear operation on an input or an output, if None is given, a default value of 8 bits is used for the input/output quantization. For such models, the user can also specify a dictionary with model_inputs/model_outputs keys to override the 8-bit default, or a single integer for both values.
configuration (Configuration): Configuration object to use during compilation
artifacts (DebugArtifacts): Artifacts object to fill during compilation
show_mlir (bool): if set, the MLIR produced by the converter and which is going to be sent to the compiler backend is shown on the screen, e.g., for debugging or demo
rounding_threshold_bits (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision
p_error (Optional[float]): probability of error of a single PBS
global_p_error (Optional[float]): probability of error of the full circuit. In FHE simulation global_p_error is set to 0
output_onnx_file (Optional[Union[Path, str]]): temporary file to store the ONNX model. If None, a temporary file is generated
verbose (bool): whether to show compilation information
inputs_encryption_status (Optional[Sequence[str]]): encryption status ('clear', 'encrypted') for each input. By default all arguments will be encrypted.
reduce_sum_copy (bool): if the inputs of QuantizedReduceSum should be copied to avoid bit-width propagation

Returns:

QuantizedModule: The resulting compiled QuantizedModule.
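The `p_error` and `global_p_error` arguments differ in scope: the first bounds the error of a single PBS, the second the error of the full circuit. The sketch below relates the two under a simplifying assumption that PBS errors are independent and that the circuit contains `n_pbs` of them. Both helper names and the `n_pbs` parameter are hypothetical, and this is not a description of how Concrete selects its parameters.

```python
def global_error_from_p_error(p_error: float, n_pbs: int) -> float:
    """Probability that at least one of n_pbs independent PBS fails."""
    return 1.0 - (1.0 - p_error) ** n_pbs


def p_error_from_global_error(global_p_error: float, n_pbs: int) -> float:
    """Per-PBS error probability that yields the given circuit-level error."""
    return 1.0 - (1.0 - global_p_error) ** (1.0 / n_pbs)
```

Under this independence assumption, a circuit with many PBS needs a much smaller per-PBS error to reach the same circuit-level target, which is why the two arguments are not interchangeable.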
