concrete.ml.sklearn.base.md

module concrete.ml.sklearn.base

Module that contains base classes for our library's estimators.

Global Variables

  • OPSET_VERSION_FOR_ONNX_EXPORT


function get_sklearn_models

get_sklearn_models()

Return the list of available models in Concrete-ML.

Returns: the list of models in Concrete-ML


function get_sklearn_linear_models

get_sklearn_linear_models(
    classifier: bool = True,
    regressor: bool = True,
    str_in_class_name: str = None
)

Return the list of available linear models in Concrete-ML.

Args:

  • classifier (bool): whether you want classifiers or not

  • regressor (bool): whether you want regressors or not

  • str_in_class_name (str): if not None, only return models with this as a substring in the class name

Returns: the list of linear models in Concrete-ML
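The classifier, regressor, and str_in_class_name arguments act as filters on a list of model classes. The following is a minimal sketch of how such filtering could behave. It is not the Concrete-ML implementation: the stand-in classes, the `is_classifier` attribute, and the `filter_models` helper are hypothetical names introduced only for illustration.

```python
from typing import List, Optional


# Hypothetical stand-in classes; the real functions return Concrete-ML model classes.
class LinearRegression:
    is_classifier = False


class LogisticRegression:
    is_classifier = True


class LinearSVC:
    is_classifier = True


ALL_LINEAR = [LinearRegression, LogisticRegression, LinearSVC]


def filter_models(
    models: List[type],
    classifier: bool = True,
    regressor: bool = True,
    str_in_class_name: Optional[str] = None,
) -> List[type]:
    """Keep the classes that match the requested kind and name substring."""
    selected = []
    for cls in models:
        if cls.is_classifier and not classifier:
            continue  # classifiers were not requested
        if not cls.is_classifier and not regressor:
            continue  # regressors were not requested
        if str_in_class_name is not None and str_in_class_name not in cls.__name__:
            continue  # class name does not contain the substring
        selected.append(cls)
    return selected


only_classifiers = filter_models(ALL_LINEAR, regressor=False)
with_regression = filter_models(ALL_LINEAR, str_in_class_name="Regression")
```

Note that the substring filter applies to the class name only, so a classifier such as the stand-in LogisticRegression still matches "Regression".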


function get_sklearn_tree_models

get_sklearn_tree_models(
    classifier: bool = True,
    regressor: bool = True,
    str_in_class_name: str = None
)

Return the list of available tree models in Concrete-ML.

Args:

  • classifier (bool): whether you want classifiers or not

  • regressor (bool): whether you want regressors or not

  • str_in_class_name (str): if not None, only return models with this as a substring in the class name

Returns: the list of tree models in Concrete-ML


function get_sklearn_neural_net_models

get_sklearn_neural_net_models(
    classifier: bool = True,
    regressor: bool = True,
    str_in_class_name: str = None
)

Return the list of available neural net models in Concrete-ML.

Args:

  • classifier (bool): whether you want classifiers or not

  • regressor (bool): whether you want regressors or not

  • str_in_class_name (str): if not None, only return models with this as a substring in the class name

Returns: the list of neural net models in Concrete-ML


class QuantizedTorchEstimatorMixin

Mixin that provides quantization for a torch module and follows the Estimator API.

This class should be mixed in with another that provides the full Estimator API. It only provides modifiers for .fit() (with quantization) and .predict() (optionally in FHE).

method __init__

__init__()

property base_estimator_type

Get the sklearn estimator that should be trained by the child class.


property base_module_to_compile

Get the Torch module that should be compiled to FHE.


property fhe_circuit

Get the FHE circuit.

Returns:

  • Circuit: the FHE circuit


property input_quantizers

Get the input quantizers.

Returns:

  • List[Quantizer]: the input quantizers


property n_bits_quant

Get the number of quantization bits.


property onnx_model

Get the ONNX model.

Returns:

  • onnx.ModelProto: the ONNX model


property output_quantizers

Get the output quantizers.

Returns:

  • List[QuantizedArray]: the output quantizers


property quantize_input

Get the input quantization function.

Returns:

  • Callable : function that quantizes the input


method compile

compile(
    X: ndarray,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

Compile the model.

Args:

  • X (numpy.ndarray): the dequantized dataset

  • configuration (Optional[Configuration]): the options for compilation

  • compilation_artifacts (Optional[DebugArtifacts]): artifacts object to fill during compilation

  • show_mlir (bool): whether or not to show MLIR during the compilation

  • use_virtual_lib (bool): whether to compile using the virtual library that allows higher bitwidths

  • p_error (Optional[float]): probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. Not simulated by the VL, i.e., taken as 0

  • verbose_compilation (bool): whether to show compilation information

Returns:

  • Circuit: the compiled Circuit.

Raises:

  • ValueError: if called before the model is trained
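To build intuition for how p_error (a single PBS) and global_p_error (the full circuit) relate, the sketch below assumes that every PBS fails independently with the same probability. Under that assumption, the probability that at least one of n PBS operations fails is 1 - (1 - p)^n. The function name and the independence assumption are illustrative only, and this is not claimed to be how the compiler derives its parameters.

```python
def circuit_error_probability(p_error: float, n_pbs: int) -> float:
    """Probability that at least one of n_pbs independent PBS operations fails.

    Assumption (not from the library): all PBS share the same p_error
    and fail independently of one another.
    """
    return 1.0 - (1.0 - p_error) ** n_pbs


# With a per-PBS error of 1e-5, a circuit with 1000 PBS has roughly a 1% chance
# that at least one of them is wrong.
estimate = circuit_error_probability(1e-5, 1000)
```

The takeaway is that a fixed per-PBS error grows into a larger whole-circuit error as the number of PBS operations increases, which is why the two parameters are exposed separately.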


method fit

fit(X, y, **fit_params)

Initialize and fit the module.

If the module was already initialized, calling fit re-initializes it (unless warm_start is True). In addition to the torch training step, this method performs quantization of the trained torch model.

Args:

  • X : training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series objects

  • y (numpy.ndarray): labels associated with training data

  • **fit_params: additional parameters that can be used during training; these are passed to the torch training interface

Returns:

  • self: the trained quantized estimator


method fit_benchmark

fit_benchmark(X: ndarray, y: ndarray, *args, **kwargs) → Tuple[Any, Any]

Fit the quantized estimator as well as its equivalent float estimator.

This function returns both the quantized estimator (itself) as well as its non-quantized (float) equivalent, which are both trained separately. This is useful in order to compare performances between quantized and fp32 versions.

Args:

  • X : The training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series objects

  • y (numpy.ndarray): The labels associated with the training data

  • *args: The arguments to pass to the sklearn linear model.

  • **kwargs: The keyword arguments to pass to the sklearn linear model.

Returns:

  • self: The trained quantized estimator

  • fp32_model: The trained float equivalent estimator
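The pattern of returning a quantized estimator together with its float equivalent can be sketched with a toy model. Everything below is hypothetical and only mirrors the shape of the return value: `FloatSlope` and `QuantizedSlope` are invented classes, and rounding a single weight to a fixed grid stands in for real quantization.

```python
class FloatSlope:
    """Toy float model: fits y ~ w * x by least squares (no intercept)."""

    def fit(self, X, y):
        self.w = sum(x * t for x, t in zip(X, y)) / sum(x * x for x in X)
        return self

    def predict(self, X):
        return [self.w * x for x in X]


class QuantizedSlope(FloatSlope):
    """Toy quantized model: same fit, then the weight is rounded to a grid."""

    def __init__(self, n_bits=3):
        self.n_bits = n_bits

    def fit(self, X, y):
        super().fit(X, y)
        step = 1.0 / (2 ** self.n_bits)  # grid spacing shrinks as n_bits grows
        self.w = round(self.w / step) * step
        return self

    def fit_benchmark(self, X, y):
        # Train the float equivalent separately, then the quantized model itself.
        float_model = FloatSlope().fit(X, y)
        self.fit(X, y)
        return self, float_model


X = [1.0, 2.0, 3.0]
y = [1.1, 2.2, 3.3]
q_model, fp_model = QuantizedSlope(n_bits=3).fit_benchmark(X, y)

# Largest disagreement between the quantized and float predictions.
gap = max(abs(a - b) for a, b in zip(q_model.predict(X), fp_model.predict(X)))
```

Comparing the two returned models on the same data, as done with `gap` here, is the kind of quantized-versus-float comparison the method is meant to support.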


method get_params_for_benchmark

get_params_for_benchmark()

Get the parameters to instantiate the sklearn estimator trained by the child class.

Returns:

  • params (dict): dictionary with parameters that will initialize a new Estimator


method post_processing

post_processing(y_preds: ndarray) → ndarray

Post-process the output.

Args:

  • y_preds (numpy.ndarray): the output to post-process

Raises:

  • ValueError: if the post-processing function is unknown

Returns:

  • numpy.ndarray: the post-processed output


method predict

predict(X, execute_in_fhe=False)

Predict on user provided data.

Predicts using the quantized clear or FHE classifier.

Args:

  • X : input data, a numpy array of raw values (non-quantized)

  • execute_in_fhe : whether to execute the inference in FHE or in the clear

Returns:

  • y_pred : numpy ndarray with predictions


method predict_proba

predict_proba(X, execute_in_fhe=False)

Predict on user provided data, returning probabilities.

Predicts using the quantized clear or FHE classifier.

Args:

  • X : input data, a numpy array of raw values (non-quantized)

  • execute_in_fhe : whether to execute the inference in FHE or in the clear

Returns:

  • y_pred : numpy ndarray with probabilities (if applicable)

Raises:

  • ValueError: if the estimator was not yet trained or compiled
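The ValueError conditions documented for compile and predict_proba imply an ordering: fit first, then compile, and only then run inference in FHE. The toy class below sketches that ordering with plain flags. `ToyEstimator` and its attributes are hypothetical; no training, compilation, or encryption takes place.

```python
class ToyEstimator:
    """Hypothetical stand-in that only tracks the fit -> compile -> predict order."""

    def __init__(self):
        self.is_fitted = False
        self.is_compiled = False

    def fit(self, X, y):
        self.is_fitted = True
        return self

    def compile(self, X):
        if not self.is_fitted:
            raise ValueError("The model is not fitted. Call fit() before compile().")
        self.is_compiled = True

    def predict(self, X, execute_in_fhe=False):
        if not self.is_fitted:
            raise ValueError("The model is not fitted. Call fit() first.")
        if execute_in_fhe and not self.is_compiled:
            raise ValueError("The model is not compiled. Call compile() first.")
        # Return a label per sample that records which path was taken.
        return ["fhe" if execute_in_fhe else "clear" for _ in X]


model = ToyEstimator().fit([[0.0], [1.0]], [0, 1])
clear_path = model.predict([[0.0], [1.0]])  # allowed right after fit
model.compile([[0.0], [1.0]])
fhe_path = model.predict([[0.0]], execute_in_fhe=True)  # allowed after compile
```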


class BaseTreeEstimatorMixin

Mixin class for tree-based estimators.

A place to share methods that are used on all tree-based estimators.

method __init__

__init__(n_bits: int)

Initialize the BaseTreeEstimatorMixin.

Args:

  • n_bits (int): number of bits used for quantization


property onnx_model

Get the ONNX model.

Returns:

  • onnx.ModelProto: the ONNX model


method compile

compile(
    X: ndarray,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

Compile the model.

Args:

  • X (numpy.ndarray): the dequantized dataset

  • configuration (Optional[Configuration]): the options for compilation

  • compilation_artifacts (Optional[DebugArtifacts]): artifacts object to fill during compilation

  • show_mlir (bool): whether or not to show MLIR during the compilation

  • use_virtual_lib (bool): set to True to use the virtual library, which simulates FHE computation. Defaults to False

  • p_error (Optional[float]): probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. Not simulated by the VL, i.e., taken as 0

  • verbose_compilation (bool): whether to show compilation information

Returns:

  • Circuit: the compiled Circuit.


method dequantize_output

dequantize_output(y_preds: ndarray)

Dequantize the integer predictions.

Args:

  • y_preds (numpy.ndarray): the predictions

Returns: the dequantized predictions


method fit_benchmark

fit_benchmark(
    X: ndarray,
    y: ndarray,
    *args,
    random_state: Optional[int] = None,
    **kwargs
) → Tuple[Any, Any]

Fit the sklearn tree-based model and the FHE tree-based model.

Args:

  • X (numpy.ndarray): The input data.

  • y (numpy.ndarray): The target data.

  • random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.

  • *args: args for super().fit

  • **kwargs: kwargs for super().fit

Returns: Tuple[ConcreteEstimators, SklearnEstimators]: The FHE and sklearn tree-based models.


method quantize_input

quantize_input(X: ndarray)

Quantize the input.

Args:

  • X (numpy.ndarray): the input

Returns: the quantized input
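As an illustration of what quantize_input and dequantize_output do conceptually, here is a minimal sketch of uniform quantization on plain Python lists. It is an assumption-laden stand-in, not the library's quantizer: `calibrate`, `quantize`, and `dequantize` are hypothetical helpers, and the min/max calibration and clipping choices are made for the example.

```python
from typing import List, Tuple


def calibrate(values: List[float], n_bits: int) -> Tuple[float, float]:
    """Derive a scale and offset so that the value range maps onto n_bits integers."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return scale, lo


def quantize(values: List[float], scale: float, offset: float, n_bits: int) -> List[int]:
    """Map floats to integers in [0, 2**n_bits - 1], clipping out-of-range inputs."""
    q_max = 2 ** n_bits - 1
    return [min(max(round((v - offset) / scale), 0), q_max) for v in values]


def dequantize(q_values: List[int], scale: float, offset: float) -> List[float]:
    """Map quantized integers back to approximate float values."""
    return [q * scale + offset for q in q_values]


train = [0.0, 0.5, 1.0, 1.5]
scale, offset = calibrate(train, n_bits=2)  # 4 levels spaced 0.5 apart

q = quantize([0.0, 0.7, 2.0], scale, offset, n_bits=2)
x_hat = dequantize(q, scale, offset)
```

In this run, 0.7 snaps to the nearest level and 2.0 is clipped to the top of the calibrated range, which shows the two sources of quantization error: rounding and saturation.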


class BaseTreeRegressorMixin

Mixin class for tree-based regressors.

A place to share methods that are used on all tree-based regressors.

method __init__

__init__(n_bits: int)

Initialize the BaseTreeEstimatorMixin.

Args:

  • n_bits (int): number of bits used for quantization


property onnx_model

Get the ONNX model.

Returns:

  • onnx.ModelProto: the ONNX model


method compile

compile(
    X: ndarray,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

Compile the model.

Args:

  • X (numpy.ndarray): the dequantized dataset

  • configuration (Optional[Configuration]): the options for compilation

  • compilation_artifacts (Optional[DebugArtifacts]): artifacts object to fill during compilation

  • show_mlir (bool): whether or not to show MLIR during the compilation

  • use_virtual_lib (bool): set to True to use the virtual library, which simulates FHE computation. Defaults to False

  • p_error (Optional[float]): probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. Not simulated by the VL, i.e., taken as 0

  • verbose_compilation (bool): whether to show compilation information

Returns:

  • Circuit: the compiled Circuit.


method dequantize_output

dequantize_output(y_preds: ndarray)

Dequantize the integer predictions.

Args:

  • y_preds (numpy.ndarray): the predictions

Returns: the dequantized predictions


method fit

fit(X, y: ndarray, **kwargs) → Any

Fit the tree-based estimator.

Args:

  • X : training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series objects

  • y (numpy.ndarray): The target data.

  • **kwargs: keyword arguments for super().fit

Returns:

  • Any: The fitted model.


method fit_benchmark

fit_benchmark(
    X: ndarray,
    y: ndarray,
    *args,
    random_state: Optional[int] = None,
    **kwargs
) → Tuple[Any, Any]

Fit the sklearn tree-based model and the FHE tree-based model.

Args:

  • X (numpy.ndarray): The input data.

  • y (numpy.ndarray): The target data.

  • random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.

  • *args: args for super().fit

  • **kwargs: kwargs for super().fit

Returns: Tuple[ConcreteEstimators, SklearnEstimators]: The FHE and sklearn tree-based models.


method post_processing

post_processing(y_preds: ndarray) → ndarray

Apply post-processing to the predictions.

Args:

  • y_preds (numpy.ndarray): The predictions.

Returns:

  • numpy.ndarray: The post-processed predictions.


method predict

predict(X: ndarray, execute_in_fhe: bool = False) → ndarray

Predict the target values.

Args:

  • X (numpy.ndarray): The input data.

  • execute_in_fhe (bool): Whether to execute in FHE. Defaults to False.

Returns:

  • numpy.ndarray: The predicted target values.


method quantize_input

quantize_input(X: ndarray)

Quantize the input.

Args:

  • X (numpy.ndarray): the input

Returns: the quantized input


class BaseTreeClassifierMixin

Mixin class for tree-based classifiers.

A place to share methods that are used on all tree-based classifiers.

method __init__

__init__(n_bits: int)

Initialize the BaseTreeEstimatorMixin.

Args:

  • n_bits (int): number of bits used for quantization


property onnx_model

Get the ONNX model.

Returns:

  • onnx.ModelProto: the ONNX model


method compile

compile(
    X: ndarray,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

Compile the model.

Args:

  • X (numpy.ndarray): the dequantized dataset

  • configuration (Optional[Configuration]): the options for compilation

  • compilation_artifacts (Optional[DebugArtifacts]): artifacts object to fill during compilation

  • show_mlir (bool): whether or not to show MLIR during the compilation

  • use_virtual_lib (bool): set to True to use the virtual library, which simulates FHE computation. Defaults to False

  • p_error (Optional[float]): probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. Not simulated by the VL, i.e., taken as 0

  • verbose_compilation (bool): whether to show compilation information

Returns:

  • Circuit: the compiled Circuit.


method dequantize_output

dequantize_output(y_preds: ndarray)

Dequantize the integer predictions.

Args:

  • y_preds (numpy.ndarray): the predictions

Returns: the dequantized predictions


method fit

fit(X, y: ndarray, **kwargs) → Any

Fit the tree-based estimator.

Args:

  • X : training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series objects

  • y (numpy.ndarray): The target data.

  • **kwargs: keyword arguments for super().fit

Returns:

  • Any: The fitted model.


method fit_benchmark

fit_benchmark(
    X: ndarray,
    y: ndarray,
    *args,
    random_state: Optional[int] = None,
    **kwargs
) → Tuple[Any, Any]

Fit the sklearn tree-based model and the FHE tree-based model.

Args:

  • X (numpy.ndarray): The input data.

  • y (numpy.ndarray): The target data.

  • random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.

  • *args: args for super().fit

  • **kwargs: kwargs for super().fit

Returns: Tuple[ConcreteEstimators, SklearnEstimators]: The FHE and sklearn tree-based models.


method post_processing

post_processing(y_preds: ndarray) → ndarray

Apply post-processing to the predictions.

Args:

  • y_preds (numpy.ndarray): The predictions.

Returns:

  • numpy.ndarray: The post-processed predictions.


method predict

predict(X: ndarray, execute_in_fhe: bool = False) → ndarray

Predict the class with highest probability.

Args:

  • X (numpy.ndarray): The input data.

  • execute_in_fhe (bool): Whether to execute in FHE. Defaults to False.

Returns:

  • numpy.ndarray: The predicted target values.
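"Predict the class with highest probability" amounts to an argmax over each row of class probabilities. The sketch below shows that step on plain lists. It assumes the probabilities are already available and that a list of class labels exists; `predict_from_proba` and `classes` are hypothetical names, not part of the documented API.

```python
from typing import List, Sequence


def predict_from_proba(proba: List[List[float]], classes: Sequence) -> list:
    """Return, for each row, the label of the column with the largest probability.

    Ties resolve to the first (lowest-index) class.
    """
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in proba]


labels = predict_from_proba([[0.2, 0.8], [0.6, 0.4]], classes=["a", "b"])
```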


method predict_proba

predict_proba(X: ndarray, execute_in_fhe: bool = False) → ndarray

Predict the probability.

Args:

  • X (numpy.ndarray): The input data.

  • execute_in_fhe (bool): Whether to execute in FHE. Defaults to False.

Returns:

  • numpy.ndarray: The predicted probabilities.


method quantize_input

quantize_input(X: ndarray)

Quantize the input.

Args:

  • X (numpy.ndarray): the input

Returns: the quantized input


class SklearnLinearModelMixin

A Mixin class for sklearn linear models with FHE.

method __init__

__init__(*args, n_bits: Union[int, Dict[str, int]] = 8, **kwargs)

Initialize the FHE linear model.

Args:

  • n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed, the value is used for quantizing both inputs and weights. If a dict is passed, it should contain "op_inputs" and "op_weights" as keys, each mapped to a number of quantization bits: op_inputs is the number of bits used to quantize the input values, and op_weights is the number of bits used to quantize the learned parameters. Defaults to 8.

  • *args: The arguments to pass to the sklearn linear model.

  • **kwargs: The keyword arguments to pass to the sklearn linear model.
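The two accepted forms of n_bits can be reduced to a single dictionary. The helper below is a sketch of one way to do that, assuming only the "op_inputs" and "op_weights" keys named above. `normalize_n_bits` is a hypothetical function, and the error raised for a missing key is a choice made for this example.

```python
from typing import Dict, Union


def normalize_n_bits(n_bits: Union[int, Dict[str, int]] = 8) -> Dict[str, int]:
    """Return a dict with explicit bit widths for inputs and weights."""
    if isinstance(n_bits, int):
        # A single int applies to both inputs and weights.
        return {"op_inputs": n_bits, "op_weights": n_bits}

    missing = {"op_inputs", "op_weights"} - set(n_bits)
    if missing:
        raise ValueError(f"n_bits is missing keys: {sorted(missing)}")
    return {"op_inputs": n_bits["op_inputs"], "op_weights": n_bits["op_weights"]}


default_bits = normalize_n_bits()
custom_bits = normalize_n_bits({"op_inputs": 6, "op_weights": 2})
```

Passing a dict is useful when inputs and weights tolerate different precision, for example keeping inputs at 6 bits while compressing weights to 2.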


method clean_graph

clean_graph()

Clean the graph of the onnx model.

This removes the Cast nodes from the model's onnx.graph, since they have no use in quantized or FHE models.
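Removing a pass-through node from a graph means deleting it and rewiring its consumers to read from its input instead. The sketch below shows that idea on a simplified graph made of dictionaries rather than real ONNX objects. The node layout and `remove_cast_nodes` are assumptions for illustration, and the sketch assumes the nodes are listed in topological order and that no graph output is produced directly by a Cast.

```python
from typing import Dict, List


def remove_cast_nodes(nodes: List[dict]) -> List[dict]:
    """Drop every 'Cast' node and reconnect its consumers to the Cast's input."""
    rename: Dict[str, str] = {}  # maps a removed Cast output to its source tensor
    kept = []
    for node in nodes:
        # Resolve inputs that pointed at an already removed Cast.
        inputs = [rename.get(name, name) for name in node["inputs"]]
        if node["op_type"] == "Cast":
            rename[node["outputs"][0]] = inputs[0]
            continue
        kept.append({**node, "inputs": inputs})
    return kept


graph = [
    {"op_type": "Cast", "inputs": ["x"], "outputs": ["x_cast"]},
    {"op_type": "Gemm", "inputs": ["x_cast", "w"], "outputs": ["y"]},
]
cleaned = remove_cast_nodes(graph)
```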


method compile

compile(
    X: ndarray,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

Compile the FHE linear model.

Args:

  • X (numpy.ndarray): The input data.

  • configuration (Optional[Configuration]): Configuration object to use during compilation

  • compilation_artifacts (Optional[DebugArtifacts]): Artifacts object to fill during compilation

  • show_mlir (bool): If set, the MLIR produced by the converter and which is going to be sent to the compiler backend is shown on the screen, e.g., for debugging or demo. Defaults to False.

  • use_virtual_lib (bool): Whether to compile using the virtual library that allows higher bitwidths with simulated FHE computation. Defaults to False

  • p_error (Optional[float]): Probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. Not simulated by the VL, i.e., taken as 0

  • verbose_compilation (bool): whether to show compilation information

Returns:

  • Circuit: The compiled Circuit.


method dequantize_output

dequantize_output(q_y_preds: ndarray) → ndarray

Dequantize the output.

Args:

  • q_y_preds (numpy.ndarray): The quantized output to dequantize

Returns:

  • numpy.ndarray: The dequantized output


method fit

fit(X, y: ndarray, *args, **kwargs) → Any

Fit the FHE linear model.

Args:

  • X : Training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series objects

  • y (numpy.ndarray): The target data.

  • *args: The arguments to pass to the sklearn linear model.

  • **kwargs: The keyword arguments to pass to the sklearn linear model.

Returns: Any


method fit_benchmark

fit_benchmark(
    X: ndarray,
    y: ndarray,
    *args,
    random_state: Optional[int] = None,
    **kwargs
) → Tuple[Any, Any]

Fit the sklearn linear model and the FHE linear model.

Args:

  • X (numpy.ndarray): The input data.

  • y (numpy.ndarray): The target data.

  • random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.

  • *args: Arguments for super().fit

  • **kwargs: Keyword arguments for super().fit

Returns: Tuple[SklearnLinearModelMixin, sklearn.linear_model.LinearRegression]: The FHE and sklearn LinearRegression.


method post_processing

post_processing(y_preds: ndarray) → ndarray

Post-process the quantized output.

For linear models, post-processing consists only of a dequantization step.

Args:

  • y_preds (numpy.ndarray): The quantized outputs to post-process

Returns:

  • numpy.ndarray: The post-processed output


method predict

predict(X: ndarray, execute_in_fhe: bool = False) → ndarray

Predict on user data.

Predict on user data using either the quantized clear model, implemented with tensors, or, if execute_in_fhe is set, using the compiled FHE circuit.

Args:

  • X (numpy.ndarray): The input data

  • execute_in_fhe (bool): Whether to execute the inference in FHE

Returns:

  • numpy.ndarray: The prediction as ordinals


method quantize_input

quantize_input(X: ndarray)

Quantize the input.

Args:

  • X (numpy.ndarray): The input to quantize

Returns:

  • numpy.ndarray: The quantized input


class SklearnLinearClassifierMixin

A Mixin class for sklearn linear classifiers with FHE.

method __init__

__init__(*args, n_bits: Union[int, Dict[str, int]] = 8, **kwargs)

Initialize the FHE linear model.

Args:

  • n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed, the value is used for quantizing both inputs and weights. If a dict is passed, it should contain "op_inputs" and "op_weights" as keys, each mapped to a number of quantization bits: op_inputs is the number of bits used to quantize the input values, and op_weights is the number of bits used to quantize the learned parameters. Defaults to 8.

  • *args: The arguments to pass to the sklearn linear model.

  • **kwargs: The keyword arguments to pass to the sklearn linear model.


method clean_graph

clean_graph()

Clean the graph of the onnx model.

Any operators following gemm, including the sigmoid, softmax and argmax operators, are removed from the graph. They will be executed in clear in the post-processing method.


method compile

compile(
    X: ndarray,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

Compile the FHE linear model.

Args:

  • X (numpy.ndarray): The input data.

  • configuration (Optional[Configuration]): Configuration object to use during compilation

  • compilation_artifacts (Optional[DebugArtifacts]): Artifacts object to fill during compilation

  • show_mlir (bool): If set, the MLIR produced by the converter and which is going to be sent to the compiler backend is shown on the screen, e.g., for debugging or demo. Defaults to False.

  • use_virtual_lib (bool): Whether to compile using the virtual library that allows higher bitwidths with simulated FHE computation. Defaults to False

  • p_error (Optional[float]): Probability of error of a single PBS

  • global_p_error (Optional[float]): probability of error of the full circuit. Not simulated by the VL, i.e., taken as 0

  • verbose_compilation (bool): whether to show compilation information

Returns:

  • Circuit: The compiled Circuit.


method decision_function

decision_function(X: ndarray, execute_in_fhe: bool = False) → ndarray

Predict confidence scores for samples.

Args:

  • X (numpy.ndarray): Samples to predict.

  • execute_in_fhe (bool): If True, the inference will be executed in FHE. Defaults to False.

Returns:

  • numpy.ndarray: Confidence scores for samples.


method dequantize_output

dequantize_output(q_y_preds: ndarray) → ndarray

Dequantize the output.

Args:

  • q_y_preds (numpy.ndarray): The quantized output to dequantize

Returns:

  • numpy.ndarray: The dequantized output


method fit

fit(X, y: ndarray, *args, **kwargs) → Any

Fit the FHE linear model.

Args:

  • X : Training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series objects

  • y (numpy.ndarray): The target data.

  • *args: The arguments to pass to the sklearn linear model.

  • **kwargs: The keyword arguments to pass to the sklearn linear model.

Returns: Any


method fit_benchmark

fit_benchmark(
    X: ndarray,
    y: ndarray,
    *args,
    random_state: Optional[int] = None,
    **kwargs
) → Tuple[Any, Any]

Fit the sklearn linear model and the FHE linear model.

Args:

  • X (numpy.ndarray): The input data.

  • y (numpy.ndarray): The target data.

  • random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.

  • *args: Arguments for super().fit

  • **kwargs: Keyword arguments for super().fit

Returns: Tuple[SklearnLinearModelMixin, sklearn.linear_model.LinearRegression]: The FHE and sklearn LinearRegression.


method post_processing

post_processing(y_preds: ndarray, already_dequantized: bool = False)

Post-processing the predictions.

This step may include a dequantization of the inputs if not done previously, in particular within the client-server workflow.

Args:

  • y_preds (numpy.ndarray): The predictions to post-process.

  • already_dequantized (bool): Whether the inputs were already dequantized or not. Defaults to False.

Returns:

  • numpy.ndarray: The post-processed predictions.
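Since the operators that follow the gemm (sigmoid, softmax, argmax) run in the clear, the post-processing step can be pictured as an optional dequantization followed by a sigmoid or softmax. The sketch below illustrates that flow on plain lists. The `scale` and `zero_point` parameters, their default values, and the rule of using a sigmoid for a single score column are assumptions made for the example, not the library's behavior.

```python
import math
from typing import List


def post_process(
    y_preds: List[List[float]],
    already_dequantized: bool = False,
    scale: float = 0.1,      # hypothetical output quantization scale
    zero_point: int = 0,     # hypothetical output quantization zero point
) -> List[List[float]]:
    """Dequantize the scores if needed, then turn them into class probabilities."""
    if not already_dequantized:
        y_preds = [[(q - zero_point) * scale for q in row] for row in y_preds]

    probabilities = []
    for row in y_preds:
        if len(row) == 1:
            # Binary case: a single score goes through a sigmoid.
            p = 1.0 / (1.0 + math.exp(-row[0]))
            probabilities.append([1.0 - p, p])
        else:
            # Multi-class case: numerically stable softmax over the scores.
            top = max(row)
            exps = [math.exp(v - top) for v in row]
            total = sum(exps)
            probabilities.append([e / total for e in exps])
    return probabilities


binary = post_process([[0]])                       # quantized score, dequantized here
multi = post_process([[1.0, 1.0, 1.0]], already_dequantized=True)
```

The already_dequantized flag matters in a client-server workflow, where the client may have dequantized the decrypted result before calling post-processing.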


method predict

predict(X: ndarray, execute_in_fhe: bool = False) → ndarray

Predict on user data.

Predict on user data using either the quantized clear model, implemented with tensors, or, if execute_in_fhe is set, using the compiled FHE circuit.

Args:

  • X (numpy.ndarray): Samples to predict.

  • execute_in_fhe (bool): If True, the inference will be executed in FHE. Defaults to False.

Returns:

  • numpy.ndarray: The prediction as ordinals.


method predict_proba

predict_proba(X: ndarray, execute_in_fhe: bool = False) → ndarray

Predict class probabilities for samples.

Args:

  • X (numpy.ndarray): Samples to predict.

  • execute_in_fhe (bool): If True, the inference will be executed in FHE. Defaults to False.

Returns:

  • numpy.ndarray: Class probabilities for samples.


method quantize_input

quantize_input(X: ndarray)

Quantize the input.

Args:

  • X (numpy.ndarray): The input to quantize

Returns:

  • numpy.ndarray: The quantized input
