Pruning

In neural networks, a neuron computes a linear combination of inputs and learned weights, then applies an activation function.

Artificial Neuron (from: Wikipedia)

The neuron computes:

$$y_k = \phi\left(\sum_i w_i x_i\right)$$
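
As a plain Python illustration of this formula (using NumPy, with tanh as an example activation; this is not Concrete-ML code):

```python
import numpy as np

def neuron(x, w, phi=np.tanh):
    """A single artificial neuron: linear combination of inputs and
    weights, followed by an activation function phi."""
    v_k = np.dot(w, x)  # v_k = sum_i w_i * x_i
    return phi(v_k)     # y_k = phi(v_k)

x = np.array([0.5, -1.0, 2.0])  # inputs x_i
w = np.array([0.1, 0.4, -0.2])  # learned weights w_i
print(neuron(x, w))             # y_k for this neuron
```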

When building a full neural network, each layer contains multiple neurons, which are connected either to the outputs of the neurons in the previous layer or to the network inputs.

Fully Connected Neural Network

For every neuron shown in each layer of the figure above, the linear combination of inputs and learned weights is computed. Depending on the values of the inputs and weights, the sum $v_k = \sum_i w_i x_i$, which for Concrete-ML neural networks is computed with integers, can take a wide range of values.

To respect the bit width constraint of the Table Lookup mechanism, implemented with programmable bootstrapping, the values of the accumulator $v_k$ must remain small enough to be representable with only 7 bits. In other words, the values must be between 0 and 127.
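
To see why the number of non-zero weights matters for the accumulator, consider a small worked example. The 3-bit unsigned quantization below is an illustrative assumption, not a Concrete-ML API:

```python
import math

def accumulator_bits(n_active, w_bits, x_bits):
    """Worst-case bit width of v_k = sum_i w_i * x_i when n_active
    weights are non-zero and weights/inputs are unsigned integers of
    w_bits and x_bits (a simplifying assumption)."""
    max_term = (2**w_bits - 1) * (2**x_bits - 1)
    return math.ceil(math.log2(n_active * max_term + 1))

# With 3-bit weights and 3-bit inputs:
print(accumulator_bits(10, 3, 3))  # 9 bits: 10 active inputs overflow the 7-bit limit
print(accumulator_bits(2, 3, 3))   # 7 bits: max accumulator is 98 <= 127
```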

Pruning a neural network entails fixing some of the weights $w_i$ to be zero during training. This is advantageous for meeting FHE constraints, as, irrespective of the distribution of the $x_i$, multiplying these inputs by 0 does not increase the accumulator value.

Fixing some of the weights to 0 makes the network graph look more similar to the following:

Pruned Fully Connected Neural Network

Pruning weights can reduce the prediction performance of the neural network, but studies show that a high level of pruning (above 50%; see Han, Song & Pool, Jeff & Tran, John & Dally, William. (2015). Learning both Weights and Connections for Efficient Neural Networks) can be applied. In Concrete-ML, Fully Connected Neural Networks are implemented with pruning, as described in the developer guide.
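
Since Concrete-ML neural networks are trained with Torch, pruning of this kind can be sketched with torch.nn.utils.prune. The example below illustrates the general technique only and is not Concrete-ML's exact training code:

```python
import torch
import torch.nn.utils.prune as prune

# A small fully connected layer: 10 inputs feeding 4 neurons.
layer = torch.nn.Linear(10, 4)

# Zero out 80% of the weights with the smallest L1 magnitude, so each
# neuron accumulates roughly 2 of its 10 inputs on average.
prune.l1_unstructured(layer, name="weight", amount=0.8)

# Once training is done, make the pruning permanent: the mask is
# folded into the weight tensor.
prune.remove(layer, "weight")

print((layer.weight == 0).float().mean())  # ~0.8: fraction of zeroed weights
```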
