concrete.ml.quantization.quantized_module
QuantizedModule API.
SUPPORTED_FLOAT_TYPES
SUPPORTED_INT_TYPES
QuantizedModule
Inference for a quantized model.
__init__
property is_compiled
Indicate if the model is compiled.
Returns:
bool
: Whether the model is compiled.
property onnx_model
Get the ONNX model.
Returns:
_onnx_model
(onnx.ModelProto): the ONNX model
property post_processing_params
Get the post-processing parameters.
Returns:
Dict[str, Any]
: the post-processing parameters
bitwidth_and_range_report
Report the ranges and bit-widths for layers that mix encrypted integer values.
Returns:
op_names_to_report
(Dict): a dictionary with operation names as keys. For each operation, (e.g., conv/gemm/add/avgpool ops), a range and a bit-width are returned. The range contains the min/max values encountered when computing the operation and the bit-width gives the number of bits needed to represent this range.
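The bit-width associated with a reported range can be sketched as follows. This is a hypothetical helper, not part of the QuantizedModule API; it only illustrates how a min/max pair of integer accumulator values maps to a number of bits:

```python
import math

def bits_for_range(vmin: int, vmax: int) -> int:
    """Number of bits needed to represent every integer in [vmin, vmax].

    Illustrative only: mirrors the idea behind bitwidth_and_range_report,
    where each operation's observed min/max values determine a bit-width.
    """
    n_values = vmax - vmin + 1  # size of the integer range
    return max(1, math.ceil(math.log2(n_values)))

# Accumulator values spanning [-128, 127] (256 distinct values) need 8 bits.
```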
check_model_is_compiled
Check if the quantized module is compiled.
Raises:
AttributeError
: If the quantized module is not compiled.
compile
Compile the module's forward function.
Args:
inputs
(numpy.ndarray): A representative set of input values used for building cryptographic parameters.
configuration
(Optional[Configuration]): Options to use for compilation. Defaults to None.
artifacts
(Optional[DebugArtifacts]): Artifacts information about the compilation process to store for debugging.
show_mlir
(bool): Indicate if the MLIR graph should be printed during compilation.
p_error
(Optional[float]): Probability of error of a single PBS. A p_error value cannot be given if a global_p_error value is already set. Defaults to None, which sets this error to a default value.
global_p_error
(Optional[float]): Probability of error of the full circuit. A global_p_error value cannot be given if a p_error value is already set. This feature is not supported during simulation, meaning the probability is currently set to 0. Defaults to None, which sets this error to a default value.
verbose
(bool): Indicate if compilation information should be printed during compilation. Defaults to False.
Returns:
Circuit
: The compiled Circuit.
dequantize_output
Take the quantized output of the last layer (q_out) and de-quantize it using that layer's de-quantization function.
Args:
q_y_preds
(numpy.ndarray): Quantized output values of the last layer.
Returns:
numpy.ndarray
: De-quantized output values of the last layer.
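The de-quantization step can be sketched as a simple affine transformation. This is a minimal illustration, assuming the last layer's output quantizer uses a single scale and zero-point (the actual UniformQuantizer in Concrete ML may hold more state); the values below are made up:

```python
import numpy as np

# Illustrative quantization parameters, not taken from a real model.
scale, zero_point = 0.05, 3

q_y_preds = np.array([3, 23, 43], dtype=np.int64)  # quantized outputs
y_preds = scale * (q_y_preds - zero_point)         # back to floats
# y_preds is approximately array([0., 1., 2.])
```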
dump
Dump itself to a file.
Args:
file
(TextIO): The file to dump the serialized object into.
dump_dict
Dump itself to a dict.
Returns:
metadata
(Dict): Dict of serialized objects.
dumps
Dump itself to a string.
Returns:
metadata
(str): String of the serialized object.
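The dump / dumps / load pattern above can be sketched with a plain dict and JSON. This is only an analogy, assuming the serialized form is a JSON-compatible metadata dict; the actual contents of Concrete ML's metadata dict are not shown here:

```python
import io
import json

# Illustrative metadata; the real dict is produced by dump_dict.
metadata = {"class": "QuantizedModule", "n_bits": 8}

# dumps: serialize to a string
serialized = json.dumps(metadata)

# dump: write the serialized object into a file-like object
buffer = io.StringIO()
buffer.write(serialized)

# load: rebuild the dict from the serialized string
restored = json.loads(buffer.getvalue())
```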
forward
Forward pass using numpy functions, operating only on floating point values.
This method executes the forward pass in the clear, with simulation or in FHE. Input values are expected to be floating points, as the method handles the quantization step. The returned values are floating points as well.
Args:
*x (numpy.ndarray)
: Input float values to consider.
fhe
(Union[FheMode, str]): The mode to use for prediction. Can be FheMode.DISABLE for Concrete ML Python inference, FheMode.SIMULATE for FHE simulation and FheMode.EXECUTE for actual FHE execution. Can also be the string representation of any of these values. Defaults to FheMode.DISABLE.
debug
(bool): In debug mode, returns quantized intermediary values of the computation. This is useful when a model's intermediary values in Concrete ML need to be compared with the intermediary values obtained in PyTorch/ONNX. When set, the second return value is a dictionary containing ONNX operation names as keys and, as values, their input QuantizedArray or ndarray. The user can thus extract the quantized or float values of quantized inputs. This feature is only available in FheMode.DISABLE mode. Defaults to False.
Returns:
numpy.ndarray
: Predictions of the quantized model, in floating points.
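The three steps forward performs in the clear (quantize the float inputs, run integer inference, de-quantize the result) can be sketched with a toy one-layer model. Everything below is illustrative, not the internals of QuantizedModule: the quantizers are reduced to a single scale and zero-point, and the layer is a single integer matrix product:

```python
import numpy as np

# Illustrative quantization parameters and weights, not from a real model.
in_scale, in_zp = 0.1, 0
out_scale, out_zp = 0.1, 0
weights = np.array([[2]], dtype=np.int64)  # already-quantized weights

def toy_forward(x: np.ndarray) -> np.ndarray:
    q_x = np.round(x / in_scale).astype(np.int64) + in_zp  # quantize inputs
    q_y = q_x @ weights                                    # integer inference
    return out_scale * (q_y - out_zp)                      # de-quantize output

y = toy_forward(np.array([[0.5]]))  # approximately array([[1.0]])
```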
load_dict
Load itself from a dict.
Args:
metadata
(Dict): Dict of serialized objects.
Returns:
QuantizedModule
: The loaded object.
post_processing
Apply post-processing to the de-quantized values.
For quantized modules, there is no post-processing step but the method is kept to make the API consistent for the client-server API.
Args:
values
(numpy.ndarray): The de-quantized values to post-process.
Returns:
numpy.ndarray
: The post-processed values.
quantize_input
Take inputs in fp32 and quantize them using the learned quantization parameters.
Args:
x
(numpy.ndarray): Floating point x.
Returns:
Union[numpy.ndarray, Tuple[numpy.ndarray, ...]]
: Quantized (numpy.int64) x.
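The quantization step itself can be sketched as rounding against a learned scale and zero-point. This is a minimal illustration with made-up parameters, assuming a single affine quantizer per input:

```python
import numpy as np

# Illustrative learned parameters, not from a real model.
scale, zero_point = 0.5, 0

x = np.array([0.0, 0.4, 1.1], dtype=np.float32)
q_x = (np.round(x / scale) + zero_point).astype(np.int64)
# q_x is integer-valued (numpy.int64): array([0, 1, 2])
```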
quantized_forward
Forward function for the FHE circuit.
Args:
*q_x (numpy.ndarray)
: Input integer values to consider.
fhe
(Union[FheMode, str]): The mode to use for prediction. Can be FheMode.DISABLE for Concrete ML Python inference, FheMode.SIMULATE for FHE simulation and FheMode.EXECUTE for actual FHE execution. Can also be the string representation of any of these values. Defaults to FheMode.DISABLE.
Returns:
numpy.ndarray
: Predictions of the quantized model, with integer values.
set_inputs_quantization_parameters
Set the quantization parameters for the module's inputs.
Args:
*input_q_params (UniformQuantizer)
: The quantizer(s) for the module.