This document introduces several common techniques for optimizing code to fit Fully Homomorphic Encryption (FHE) constraints. The examples provided demonstrate various workarounds and performance optimizations that you can implement while working with the Concrete library.
All code snippets provided here are temporary workarounds. In future versions of Concrete, some functions described here could be directly available in a more generic and efficient form. These code snippets are coming from support answers in our community forum
This example demonstrates how to retrieve a value from an array using an encrypted index. The method creates a "selection" array filled with 0
s except for the requested index, which will be 1
. It then sums the products of all array values with this selection array:
This example filters an encrypted array with an encrypted condition, in this case a greater than
comparison with an encrypted value. It packs all values with a selection bit that results from the comparison, allowing the unpacking of only the filtered values:
This example introduces a key concept when using Concrete: maximizing parallelization. Instead of sequentially summing all values to compute a mean, the values are split into sub-groups, and the mean of these sub-group means is computed:
This document provides an overview of the bit extraction feature in Concrete, including usage examples, limitations, and performance considerations.
Bit extraction could be useful in some applications that require directly manipulating bits of integers. Bit extraction allows you to extract a specific slice of bits from an integer, where index 0 corresponds to the least significant bit (LSB). The cost of this operation increases with the index of the highest significant bit you wish to extract.
Bit extraction only works in the Native
encoding, which is usually selected when all table lookups in the circuit are less than or equal to 8 bits.
You can use slices for indexing fhe.bits(value)
:
Bit extraction supports slices with negative steps:
Bit extraction supports signed integers:
Here's a practical example that uses bit extraction to determine if a number is even:
It prints:
Negative indexing is not supported: Bits extraction using negative indices is not supported, such as fhe.bits(x)[-1]
.
This is because the bit-width of x
is unknown before inputset evaluation, making it impossible to determine the correct bit to extract.
Reverse slicing requires explicit starting bit: When extracting bits in reverse order (using a negative step), the start bit must be specified, for example, fhe.bits(x)[::-1]
is not supported.
Signed integer slicing requires explicit stopping bit: For signed integers, when using slices, the stop bit must be explicitly provided, for example, fhe.bits(x)[1:]
is not supported.
Float bit extraction is not supported: While Concrete supports floats to some extent, bit extraction is not possible on float types.
Extracting a specific bit requires clearing all the preceding lower bits. This involves extracting these previous bits as intermediate values and then subtracting them from the input.
Implications:
Bits are extracted sequentially, starting from the least significant bit to the more significant ones. The cost is proportional to the index of the highest extracted bit plus one.
No parallelization is possible. The computation time is proportional to the cost, independent of the number of CPUs.
Examples:
Extracting fhe.bits(x)[4]
is approximately five times costlier than extracting fhe.bits(x)[0]
.
Extracting fhe.bits(x)[4]
takes around five times more wall clock time than fhe.bits(x)[0]
.
The cost of extracting fhe.bits(x)[0:5]
is almost the same as that of fhe.bits(x)[5]
.
Common sub-expression elimination is applied to intermediate extracted bits.
Implications:
The overall cost for a series of fhe.bits(x)[m:n]
calls on the same input x
is almost equivalent to the cost of the single most computationally expensive extraction in the series, i.e. fhe.bits(x)[n]
.
The order of extraction in that series does not affect the overall cost.
Example:
The combined operation fhe.bit(x)[3] + fhe.bit(x)[2] + fhe.bit(x)[1]
has almost the same cost as fhe.bits(x)[3]
.
Each extracted bit incurs a cost of approximately one TLU of 1-bit input precision. Therefore, fhe.bits(x)[0]
is generally faster than any other TLU operation.
This document introduces some extensions of Concrete, including functions for wrapping univariate and multivariate functions, performing convolution and maxpool operations, creating encrypted arrays, and more.
Wraps any univariate function into a single table lookup:
The wrapped function must follow these criteria:
No side effects: For example, no modification of global state
Deterministic: For example, no random number generation.
Shape consistency: output.shape
should be the same with input.shape
Element-wise mapping: Each output element must correspond to a single input element, for example. output[0]
should only depend on input[0]
of all inputs.
Violating these constraints may result in undefined outcome.
Wraps any multivariate function into a table lookup:
The wrapped functions must follow these criteria:
No side effects: For example, avoid modifying global state.
Deterministic: For example, no random number generation.
Broadcastable shapes: input.shape
should be broadcastable to output.shape
for all inputs.
Element-wise mapping: Each output element must correspond to a single input element, for example, output[0]
should only depend on input[0]
of all inputs.
Violating these constraints may result in undefined outcome.
Multivariate functions cannot be called with rounded inputs.
Perform a convolution operation, with the same semantic as onnx.Conv:
Only 2D convolutions without padding and with one group are currently supported.
Perform a maxpool operation, with the same semantic as onnx.MaxPool:
Only 2D maxpooling without padding and up to 15-bits is currently supported.
Create encrypted arrays:
Currently, only scalars can be used to create arrays.
Create an encrypted scalar zero:
Create an encrypted tensor of zeros:
Create an encrypted scalar one:
Create an encrypted tensor of ones:
Allows you to create an encrypted constant of a given value.
This extension is also compatible with constant arrays.
Hint properties of a value. Imagine you have this circuit:
You'd expect all of a
, b
, and c
to be 8-bits, but because inputset is very small, this code could print:
The first solution in these cases should be to use a bigger inputset, but it can still be tricky to solve with the inputset. That's where the hint
extension comes into play. Hints are a way to provide extra information to compilation process:
Bit-width hints are for constraining the minimum number of bits in the encoded value. If you hint a value to be 8-bits, it means it should be at least uint8
or int8
.
To fix f
using hints, you can do:
Hints are only applied to the value being hinted, and no other value. If you want the hint to be applied to multiple values, you need to hint all of them.
you'll always see:
regardless of the bounds.
Alternatively, you can use it to make sure a value can store certain integers:
Perform ReLU operation, with the same semantic as x if x >= 0 else 0
:
The ReLU operation can be implemented in two ways:
Single TLU (Table Lookup Unit) on the original bit-width: Suitable for small bit-widths, as it requires fewer resources.
Multiple TLUs on smaller bit-widths: Better for large bit-widths, avoiding the high cost of a single large TLU.
The method of conversion is controlled by the relu_on_bits_threshold: int = 7
option. For example, setting relu_on_bits_threshold=5
means:
Bit-widths from 1 to 4 will use a single TLU.
Bit-widths of 5 and above will use multiple TLUs.
Another option to fine-tune the implementation is relu_on_bits_chunk_size: int = 2
. For example, setting relu_on_bits_chunk_size=4
means that when using second implementation (using chunks), the input is split to 4-bit chunks using fhe.bits, and then the ReLU is applied to those chunks, which are then combined back.
Here is a script showing how execution cost is impacted when changing these values:
You might need to run the script twice to avoid crashing when plotting.
The script will show the following figure:
The default values of these options are set based on simple circuits. How they affect performance will depend on the circuit, so play around with them to get the most out of this extension.
Conversion with the second method (using chunks) only works in Native
encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.
Perform ternary if operation, with the same semantic as x if condition else y
:
fhe.if_then_else
is just an alias for np.where.
Copy the value:
The fhe.identity
extension is useful for cloning an input with a different bit-width.
Identity extension only works in Native
encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.
It is similar to fhe.identity
but with the extra guarantee that encryption noise is refreshed.
Refresh is useful when you want to control precisely where encryption noise is refreshed in your circuit. For instance if you are using modules, sometimes compilation rejects the module because it's not composable. This happens because a function of the module never refresh the encryption noise. Adding a return fhe.refresh(result)
on the function result solves the issue.
Refresh extension only works in Native
encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.
Create a random inputset with the given specifications:
The result will have 100 inputs by default which can be customized using the size keyword argument: