module concrete.ml.sklearn.qnn_module

Sparse Quantized Neural Network torch module.

Global Variables

  • MAX_BITWIDTH_BACKWARD_COMPATIBLE


class SparseQuantNeuralNetwork

Sparse Quantized Neural Network.

This class implements an MLP that is compatible with FHE constraints. The weights and activations are quantized to low bit-widths, and pruning is used to ensure that accumulators do not exceed a user-provided accumulator bit-width. The number of classes, the number of layers, and the breadth of the network are specified by the user.

method __init__

__init__(
    input_dim: int,
    n_layers: int,
    n_outputs: int,
    n_hidden_neurons_multiplier: int = 4,
    n_w_bits: int = 3,
    n_a_bits: int = 3,
    n_accum_bits: int = 8,
    n_prune_neurons_percentage: float = 0.0,
    activation_function: Type = torch.nn.ReLU,
    quant_narrow: bool = False,
    quant_signed: bool = True
)

Sparse Quantized Neural Network constructor.

Args:

  • input_dim (int): Number of dimensions of the input data.

  • n_layers (int): Number of linear layers for this network.

  • n_outputs (int): Number of output classes or regression targets.

  • n_w_bits (int): Number of weight bits.

  • n_a_bits (int): Number of activation and input bits.

  • n_accum_bits (int): Maximal allowed bit-width of intermediate accumulators.

  • n_hidden_neurons_multiplier (int): The number of neurons in the hidden layers will be the number of dimensions of the input multiplied by n_hidden_neurons_multiplier. Note that pruning is used to adjust the accumulator size, attempting to keep the maximum accumulator bit-width within n_accum_bits, meaning that not all hidden-layer neurons will be active. The default value of n_hidden_neurons_multiplier is chosen for inputs with few dimensions. Reducing this value decreases FHE inference time considerably, but also decreases the robustness and accuracy of model training.

  • n_prune_neurons_percentage (float): The percentage of neurons to prune in the hidden layers. This can be used when n_hidden_neurons_multiplier is set to a high value (3-4): once good accuracy is obtained, pruning a percentage of neurons speeds up the model in FHE.

  • activation_function (Type): The activation function to use in the network (e.g., torch.ReLU, torch.SELU, torch.Sigmoid, ...).

  • quant_narrow (bool): Whether this network should quantize values using a narrow range (e.g., a 2-bit signed quantization uses [-1, 0, 1] instead of [-2, -1, 0, 1]).

  • quant_signed (bool): Whether this network should quantize the values using signed integers.

Raises:

  • ValueError: If the parameters have invalid values or the computed accumulator bit-width is zero.
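
For illustration, here is a minimal sketch of constructing such a network. The parameter values below are arbitrary choices for the example; only the class name and its constructor arguments come from this API.

import torch

from concrete.ml.sklearn.qnn_module import SparseQuantNeuralNetwork

# Hypothetical example: a small MLP for 10-dimensional inputs and 3 classes.
model = SparseQuantNeuralNetwork(
    input_dim=10,
    n_layers=2,
    n_outputs=3,
    n_w_bits=3,        # weight bit-width
    n_a_bits=3,        # activation and input bit-width
    n_accum_bits=8,    # accumulator bit-width targeted through pruning
    activation_function=torch.nn.ReLU,
)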


method enable_pruning

enable_pruning() → None

Enable pruning in the network. Pruning must later be made permanent (see make_pruning_permanent) to recover the pruned weights.

Raises:

  • ValueError: If the quantization parameters are invalid.


method forward

forward(x: Tensor) → Tensor

Forward pass.

Args:

  • x (torch.Tensor): The network input.

Returns:

  • x (torch.Tensor): The network prediction.
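
As a quick usage illustration, reusing the model built in the constructor example above (the input shape is an assumption based on those constructor arguments):

import torch

# Hypothetical batch of 32 samples with input_dim=10 features each.
x = torch.randn(32, 10)
y = model(x)  # expected shape: (32, 3), one row of scores per sample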


method make_pruning_permanent

make_pruning_permanent() → None

Make the learned pruning permanent in the network.
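
A plausible end-to-end workflow for these two methods is sketched below. The training loop, optimizer, and loss are placeholders and not part of this API:

import torch

model.enable_pruning()  # attach pruning before training starts

# Schematic training loop; data_loader is a hypothetical DataLoader.
optimizer = torch.optim.Adam(model.parameters())
loss_fn = torch.nn.CrossEntropyLoss()
for x_batch, y_batch in data_loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x_batch), y_batch)
    loss.backward()
    optimizer.step()

model.make_pruning_permanent()  # bake the zeroed weights into the layers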


method max_active_neurons

max_active_neurons() → int

Compute the maximum number of active (non-zero weight) neurons.

The computation is done using the quantization parameters passed to the constructor. Warning: with the current quantization algorithm (asymmetric), the value returned by this function is not guaranteed to ensure FHE compatibility. For some weight distributions, weights that are 0 (i.e., pruned weights) will not be quantized to 0. Therefore, the total number of active quantized neurons will not be equal to max_active_neurons.

Returns:

  • int: The maximum number of active neurons.
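
For intuition, such a bound can be derived from the bit-widths alone: each product of a maximal n_w_bits weight and a maximal n_a_bits activation consumes part of the accumulator's budget. The formula below is an illustrative simplification, not necessarily the library's exact computation:

import math

# Assumed bound: how many worst-case weight-activation products fit in
# the accumulator before exceeding n_accum_bits.
n_w_bits, n_a_bits, n_accum_bits = 3, 3, 8
bound = math.floor((2**n_accum_bits - 1) / ((2**n_w_bits - 1) * (2**n_a_bits - 1)))
print(bound)  # 255 // (7 * 7) = 5 active neurons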