Advanced features

Bit extraction

Some applications require directly manipulating bits of integers. Concrete provides a bit extraction operation for such applications.

Bit extraction can extract a slice of bits from an integer. Index 0 corresponds to the least significant bit. The cost of this operation is proportional to the index of the highest extracted bit.

Bit extraction only works in the Native encoding, which is usually selected when all table lookups in the circuit are less than or equal to 8 bits.

Slices can be used for indexing fhe.bits(value) as well.

Even slices with negative steps are supported!

Signed integers are supported as well.
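
The indexing rules above can be modelled in cleartext. The following is only a sketch in plain Python, not Concrete code: the class name `CleartextBits` is hypothetical, and it assumes that a slice packs the selected bits into one integer, with the first selected bit becoming the least significant bit of the result. Signed values are treated as two's complement, which is how Python's `>>` behaves on negative integers.

```python
class CleartextBits:
    """Hypothetical cleartext stand-in for a bit view of an integer (not the Concrete API)."""

    def __init__(self, value):
        self.value = value

    def __getitem__(self, index):
        if isinstance(index, int):
            positions = [index]
        else:
            # Reject negative bounds up front, mirroring the limitation on negative indices.
            if any(bound is not None and bound < 0 for bound in (index.start, index.stop)):
                raise ValueError("negative bit indices are not supported")
            step = 1 if index.step is None else index.step
            start, stop = index.start, index.stop
            if start is None:
                if step < 0:
                    raise ValueError("reverse slices need an explicit start bit")
                start = 0
            if stop is None:
                if step > 0:
                    # The bit-width is unknown in this sketch, so the stop must be given.
                    raise ValueError("this sketch needs an explicit stop bit")
                stop = -1  # walk down to bit 0 inclusive
            positions = list(range(start, stop, step))

        if any(position < 0 for position in positions):
            raise ValueError("negative bit indices are not supported")

        # Pack the selected bits: the first selected bit becomes bit 0 of the result.
        return sum(((self.value >> position) & 1) << i for i, position in enumerate(positions))


assert CleartextBits(0b1010)[1] == 1                # single bit
assert CleartextBits(0b0101_0110)[0:4] == 0b0110    # slice of the four lowest bits
assert CleartextBits(0b0011)[3::-1] == 0b1100       # negative step reverses the bits
assert CleartextBits(-3)[0:3] == 0b101              # two's complement of a signed value
```

The same indexing expressions are the ones the text describes for fhe.bits(value); only the evaluation here happens on plain integers.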

Lastly, here is a practical use case of bit extraction.

prints

Limitations

- Bits cannot be extracted using a negative index.
  - Which means fhe.bits(x)[-1] or fhe.bits(x)[-4:-1] is not supported for example.
  - The reason for this is that we don't know in advance (i.e., before inputset evaluation) how many bits x has.
    - For example, let's say you have x == 10 == 0b_000...0001010, and you want to do fhe.bits(x)[-1]. If the value is 4-bits (i.e., 0b_1010), the result needs to be 1, but if it's 6-bits (i.e., 0b_001010), the result needs to be 0. Since we don't know the bit-width of x before inputset evaluation, we cannot calculate fhe.bits(x)[-1].
- When extracting bits using slices in reverse order (i.e., step < 0), the start bit needs to be provided explicitly.
  - Which means fhe.bits(x)[::-1] or fhe.bits(x)[:2:-1] is not supported for example.
  - The reason is the same as above.
- When extracting bits of signed values using slices, the stop bit needs to be provided explicitly.
  - Which means fhe.bits(x)[1:] or fhe.bits(x)[1::2] is not supported for example.
  - The reason is similar to above.
- Bits of floats cannot be extracted.
  - Floats are partially supported but extracting their bits is not supported at all.
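
The ambiguity behind the negative-index limitation can be checked in cleartext. This is a small plain-Python sketch; the helper name `most_significant_bit` is hypothetical and simply evaluates what index -1 would mean if the bit-width were known.

```python
def most_significant_bit(value, bit_width):
    """Bit that index -1 would select if `value` were stored in exactly `bit_width` bits."""
    return (value >> (bit_width - 1)) & 1


x = 10
assert most_significant_bit(x, 4) == 1  # 0b1010
assert most_significant_bit(x, 6) == 0  # 0b001010
```

The same value gives two different answers depending on the assumed bit-width, which is why the index has to be resolved from the least significant side.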

Performance Considerations

A Chain of Individual Bit Extractions

Key Concept: Extracting a specific bit requires clearing all the preceding lower bits. This involves extracting these previous bits as intermediate values and then subtracting them from the input.

Implications:

Bits are extracted sequentially, starting from the least significant bit to the more significant ones. The cost is proportional to the index of the highest extracted bit plus one.
No parallelization is possible. The computation time is proportional to the cost, independent of the number of CPUs.

Examples:

Extracting fhe.bits(x)[4] is approximately five times costlier than extracting fhe.bits(x)[0].
Extracting fhe.bits(x)[4] takes around five times more wall clock time than fhe.bits(x)[0].
The cost of extracting fhe.bits(x)[0:5] is almost the same as that of fhe.bits(x)[4], since the highest bit the slice extracts is bit 4.
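
The sequential dependency can be sketched in cleartext. This is an illustrative plain-Python model for non-negative integers, not the actual FHE procedure: the function name `extract_bit_sequentially` is hypothetical, and the counter stands in for the number of single-bit extractions on the path to the requested bit.

```python
def extract_bit_sequentially(value, index):
    """Return (bit, extractions) for a non-negative `value`, walking up from bit 0."""
    remaining = value
    extractions = 0
    bit = 0
    for position in range(index + 1):
        # Extract the lowest bit that has not been cleared yet.
        bit = (remaining >> position) & 1
        extractions += 1
        # Subtract it from the input so all lower bits are zero for the next step.
        # (Not needed for correctness in cleartext; it mirrors the dependency chain.)
        remaining -= bit << position
    return bit, extractions


assert extract_bit_sequentially(0b10110, 4) == (1, 5)  # five steps for index 4
assert extract_bit_sequentially(0b10110, 0) == (0, 1)  # one step for index 0
```

Each step depends on the result of the previous one, which is why the step count, rather than the number of CPUs, determines the time in this model.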

Reuse of Intermediate Extracted Bits

Key Concept: Common sub-expression elimination is applied to intermediate extracted bits.

Implications:

The overall cost of a series of fhe.bits(x)[m:n] calls on the same input x is almost the same as the cost of the single most expensive extraction in the series, i.e., the one that reaches the highest bit index.
The order of extraction in that series does not affect the overall cost.

Example:

The combined operation fhe.bits(x)[3] + fhe.bits(x)[2] + fhe.bits(x)[1] has almost the same cost as fhe.bits(x)[3].
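
A rough cost model for this reuse, in plain Python. It is a sketch under the assumption that every distinct single-bit extraction on the same input is paid for once; the function name `series_cost` is hypothetical and the unit is "one single-bit extraction", not seconds.

```python
def series_cost(indices):
    """Number of distinct single-bit extractions needed for a series on the same input."""
    needed = set()
    for index in indices:
        # Extracting bit `index` needs every bit from 0 up to `index`.
        needed.update(range(index + 1))
    return len(needed)


assert series_cost([3, 2, 1]) == series_cost([3]) == 4   # shared intermediates
assert series_cost([1, 2, 3]) == series_cost([3, 2, 1])  # order does not matter
```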

TLUs of 1b input precision

Each extracted bit incurs a cost of approximately one TLU of 1-bit input precision. Therefore, fhe.bits(x)[0] is generally faster than any other TLU operation.

Common tips

The challenge for developers is to adapt their code to fit FHE constraints. This document collects some common examples that illustrate the kinds of optimization you can apply to get better performance.

All code snippets provided here are temporary workarounds. In future versions of Concrete, some functions described here could be directly available in a more generic and efficient form. These code snippets come from support answers.

Minimum for Two values

In this first example, we compute the minimum by taking the difference between two numbers y and x, and conditionally subtracting this difference from y: the result is x if y > x, and y otherwise:
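
The arithmetic can be checked in cleartext with a plain-Python sketch. The function name `minimum` is hypothetical, and the conditional on the difference stands in for whatever univariate operation the encrypted version would use.

```python
def minimum(x, y):
    """Cleartext sketch: subtract the difference from y only when y > x."""
    diff = y - x
    # Keep the difference only when it is positive, i.e. when y > x.
    removed = diff if diff > 0 else 0
    return y - removed


assert minimum(3, 7) == 3
assert minimum(7, 3) == 3
assert minimum(5, 5) == 5
```

Only one value, the difference, goes through a conditional here; the rest is addition and subtraction.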

Maximum for Two values

The companion to the example above, computing the maximum of two integers instead of the minimum:
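
A cleartext sketch of the same idea for the maximum, in plain Python. The function name `maximum` is hypothetical; the only change from the minimum is that the difference is removed when it is negative instead of positive.

```python
def maximum(x, y):
    """Cleartext sketch: subtract the difference from y only when y < x."""
    diff = y - x
    # Keep the difference only when it is negative, i.e. when y < x.
    removed = diff if diff < 0 else 0
    return y - removed


assert maximum(3, 7) == 7
assert maximum(7, 3) == 7
assert maximum(5, 5) == 5
```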

Minimum for several values

And an extension for more than two values:
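
One possible way to extend the two-value minimum, sketched in plain Python. The names `minimum` and `minimum_of` are hypothetical, and the left-to-right pairwise reduction is an assumption made for brevity; a tree-shaped reduction would compute the same result.

```python
from functools import reduce


def minimum(x, y):
    """Two-value minimum built from a conditional difference."""
    diff = y - x
    return y - (diff if diff > 0 else 0)


def minimum_of(values):
    """Minimum of a non-empty sequence, by reducing pairwise with `minimum`."""
    return reduce(minimum, values)


assert minimum_of([7, 3, 9, 4]) == 3
assert minimum_of([2]) == 2
```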

Retrieving a value within an encrypted array with an encrypted index

This example shows how to deal with an array and an encrypted index. It will create a "selection" array filled with 0 except for the requested index that will be 1, and sum the products of all array values by this selection array:
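
The selection-array idea can be sketched in cleartext as follows. This is plain Python with a hypothetical function name `select`; in the encrypted setting the comparison that builds each selection flag would be the costly part.

```python
def select(array, index):
    """Pick array[index] without indexing: build a 0/1 selection array and sum the products."""
    selection = [1 if position == index else 0 for position in range(len(array))]
    return sum(value * flag for value, flag in zip(array, selection))


values = [10, 20, 30, 40]
assert select(values, 2) == 30
assert all(select(values, i) == values[i] for i in range(len(values)))
```

Every element of the array is touched regardless of the index, so the access pattern does not depend on the index value.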

Filter an array with comparison (>)

This example filters an encrypted array with an encrypted condition, here a greater-than comparison with an encrypted value. It packs each value together with a selection bit resulting from the comparison, which allows unpacking only the filtered values:
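
A cleartext sketch of the packing trick, in plain Python. The function name `filter_greater` is hypothetical, the values are assumed to be non-negative, and filtered-out entries are assumed to become 0.

```python
def filter_greater(values, threshold):
    """Keep values greater than `threshold`; replace the others with 0."""
    # Pack: shift each value left by one bit and store the comparison result in bit 0.
    packed = [value * 2 + (1 if value > threshold else 0) for value in values]
    # Unpack: one function of a single packed input recovers the value or yields 0.
    return [(item >> 1) if (item & 1) else 0 for item in packed]


assert filter_greater([1, 5, 3, 8], 3) == [0, 5, 0, 8]
```

The unpacking step depends on a single packed input per element, so it is a univariate function of that input at the price of one extra bit of precision.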

Matrix Row/Col means

This matrix example introduces a key concept when using Concrete: maximizing parallelization. Instead of sequentially summing all values to compute a mean, we split the values into sub-groups and take the mean of the sub-group means:
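
A cleartext sketch of the grouping, in plain Python. All function names are hypothetical, and the sub-groups are assumed to have equal size, which is what makes the mean of the sub-group means equal to the overall mean.

```python
def mean(values):
    return sum(values) / len(values)


def grouped_mean(values, group_size):
    """Mean computed as the mean of equally sized sub-group means."""
    if len(values) % group_size != 0:
        raise ValueError("group_size must divide the number of values")
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    # Each sub-group mean is independent of the others, so they could run in parallel.
    return mean([mean(group) for group in groups])


def row_means(matrix, group_size):
    return [grouped_mean(row, group_size) for row in matrix]


def col_means(matrix, group_size):
    return [grouped_mean(list(column), group_size) for column in zip(*matrix)]


matrix = [[1, 2, 3, 4], [5, 6, 7, 8]]
assert row_means(matrix, 2) == [2.5, 6.5]
assert col_means(matrix, 2) == [3.0, 4.0, 5.0, 6.0]
```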

Extensions

Concrete supports native Python and NumPy operations as much as possible, but not everything in Python or NumPy is available. Therefore, we provide some extensions ourselves to improve your experience.

fhe.univariate(function)

Allows you to wrap any univariate function into a single table lookup:

import numpy as np
from concrete import fhe

def complex_univariate_function(x):

    def per_element(element):
        result = 0
        for i in range(element):
            result += i
        return result

    return np.vectorize(per_element)(x)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.univariate(complex_univariate_function)(x)

inputset = [np.random.randint(0, 5, size=(3, 2)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array([
    [0, 4],
    [2, 1],
    [3, 0],
])
assert np.array_equal(circuit.encrypt_run_decrypt(sample), complex_univariate_function(sample))

The wrapped function:

- shouldn't have any side effects (e.g., no modification of global state)
- should be deterministic (e.g., no random numbers)
- should have the same output shape as its input (i.e., output.shape should be the same as input.shape)
- each output element should correspond to a single input element (e.g., output[0] should only depend on input[0])

If any of these constraints are violated, the outcome is undefined.

fhe.multivariate(function)

Allows you to wrap any multivariate function into a table lookup:

import numpy as np
from concrete import fhe

def value_if_condition_else_zero(value, condition):
    return value if condition else np.zeros_like(value, dtype=np.int64)

def function(x, y):
    return fhe.multivariate(value_if_condition_else_zero)(x, y)

inputset = [
    (
        np.random.randint(-2**4, 2**4, size=(2, 2)),
        np.random.randint(0, 2**1, size=()),
    )
    for _ in range(100)
]

compiler = fhe.Compiler(function, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset)

sample = [np.array([[-2, 4], [0, 1]]), 0]
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), function(*sample))

sample = [np.array([[3, -1], [2, 4]]), 1]
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), function(*sample))

The wrapped function:

- shouldn't have any side effects (e.g., no modification of global state)
- should be deterministic (e.g., no random numbers)
- should have input shapes which are broadcastable to the output shape (i.e., input.shape should be broadcastable to output.shape for all inputs)
- each output element should correspond to a single input element (e.g., output[0] should only depend on input[0] of all inputs)

If any of these constraints are violated, the outcome is undefined.

Multivariate functions cannot be called with rounded inputs.

fhe.conv(...)

Allows you to perform a convolution operation, with the same semantics as onnx.Conv:

import numpy as np
from concrete import fhe

weight = np.array([[2, 1], [3, 2]]).reshape(1, 1, 2, 2)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.conv(x, weight, strides=(2, 2), dilations=(1, 1), group=1)

inputset = [np.random.randint(0, 4, size=(1, 1, 4, 4)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array(
    [
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
    ]
).reshape(1, 1, 4, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(sample), f(sample))

Only 2D convolutions without padding and with one group are currently supported.

fhe.maxpool(...)

Allows you to perform a maxpool operation, with the same semantics as onnx.MaxPool:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.maxpool(x, kernel_shape=(2, 2), strides=(2, 2), dilations=(1, 1))

inputset = [np.random.randint(0, 4, size=(1, 1, 4, 4)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array(
    [
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
    ]
).reshape(1, 1, 4, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(sample), f(sample))

Only 2D maxpooling without padding and up to 15-bits is currently supported.

fhe.array(...)

Allows you to create encrypted arrays:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    return fhe.array([x, y])

inputset = [(3, 2), (7, 0), (0, 7), (4, 2)]
circuit = f.compile(inputset)

sample = (3, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), f(*sample))

Currently, only scalars can be used to create arrays.

fhe.zero()

Allows you to create an encrypted scalar zero:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.zero()
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert circuit.encrypt_run_decrypt(x) == x

fhe.zeros(shape)

Allows you to create an encrypted tensor of zeros:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.zeros((2, 3))
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert np.array_equal(circuit.encrypt_run_decrypt(x), np.array([[x, x, x], [x, x, x]]))

fhe.one()

Allows you to create an encrypted scalar one:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.one()
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert circuit.encrypt_run_decrypt(x) == x + 1

fhe.ones(shape)

Allows you to create an encrypted tensor of ones:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.ones((2, 3))
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert np.array_equal(circuit.encrypt_run_decrypt(x), np.array([[x, x, x], [x, x, x]]) + 1)

fhe.hint(value, **kwargs)

Allows you to hint properties of a value. Imagine you have this circuit:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted", "y": "encrypted", "z": "encrypted"})
def f(x, y, z):
    a = x | y
    b = y & z
    c = a ^ b
    return c

inputset = [
    (np.random.randint(0, 2**8), np.random.randint(0, 2**8), np.random.randint(0, 2**8))
    for _ in range(3)
]
circuit = f.compile(inputset)

print(circuit)

You'd expect all of a, b, and c to be 8-bits, but because inputset is very small, this code could print:

%0 = x                          # EncryptedScalar<uint8>        ∈ [173, 240]
%1 = y                          # EncryptedScalar<uint8>        ∈ [52, 219]
%2 = z                          # EncryptedScalar<uint8>        ∈ [36, 252]
%3 = bitwise_or(%0, %1)         # EncryptedScalar<uint8>        ∈ [243, 255]
%4 = bitwise_and(%1, %2)        # EncryptedScalar<uint7>        ∈ [0, 112] 
                                                  ^^^^^ this can lead to bugs
%5 = bitwise_xor(%3, %4)        # EncryptedScalar<uint8>        ∈ [131, 255]
return %5

The first solution in these cases should be to use a bigger inputset, but the problem can still be tricky to solve with the inputset alone. That's where the hint extension comes into play. Hints are a way to provide extra information to the compilation process:

Bit-width hints are for constraining the minimum number of bits in the encoded value. If you hint a value to be 8-bits, it means it should be at least uint8 or int8.
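
The effect of a bit-width hint can be pictured with a simplified cleartext model. This is a plain-Python sketch for unsigned values only; the function names are hypothetical and the real compiler's bit-width assignment is more involved than taking a maximum.

```python
def unsigned_bit_width(upper_bound):
    """Smallest unsigned bit-width that can hold values up to `upper_bound`."""
    return max(upper_bound.bit_length(), 1)


def hinted_bit_width(upper_bound, hint=None):
    """A hint acts as a lower limit on the bit-width inferred from the observed bounds."""
    inferred = unsigned_bit_width(upper_bound)
    return inferred if hint is None else max(inferred, hint)


assert unsigned_bit_width(112) == 7    # bounds [0, 112] are inferred as uint7
assert hinted_bit_width(112, 8) == 8   # the hint raises it to 8 bits
assert hinted_bit_width(255, 8) == 8   # the hint changes nothing when already 8 bits
```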

To fix f using hints, you can do:

@fhe.compiler({"x": "encrypted", "y": "encrypted", "z": "encrypted"})
def f(x, y, z):
    # hint that inputs should be considered at least 8-bits
    x = fhe.hint(x, bit_width=8)
    y = fhe.hint(y, bit_width=8)
    z = fhe.hint(z, bit_width=8)

    # hint that intermediates should be considered at least 8-bits
    a = fhe.hint(x | y, bit_width=8)
    b = fhe.hint(y & z, bit_width=8)
    c = fhe.hint(a ^ b, bit_width=8)

    return c

Hints are only applied to the value being hinted, and no other value. If you want the hint to be applied to multiple values, you need to hint all of them.

With these hints in place, you'll always see:

%0 = x                          # EncryptedScalar<uint8>        ∈ [...]
%1 = y                          # EncryptedScalar<uint8>        ∈ [...]
%2 = z                          # EncryptedScalar<uint8>        ∈ [...]
%3 = bitwise_or(%0, %1)         # EncryptedScalar<uint8>        ∈ [...]
%4 = bitwise_and(%1, %2)        # EncryptedScalar<uint8>        ∈ [...] 
%5 = bitwise_xor(%3, %4)        # EncryptedScalar<uint8>        ∈ [...]
return %5

regardless of the bounds.

Alternatively, you can use it to make sure a value can store certain integers:

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def is_vectors_same(x, y):
    assert x.ndim == 1
    assert y.ndim == 1
    
    assert len(x) == len(y)
    n = len(x)
    
    number_of_same_elements = np.sum(x == y)
    fhe.hint(number_of_same_elements, can_store=n)  # hint that number of same elements can go up to n
    is_same = number_of_same_elements == n

    return is_same

fhe.relu(value)

Allows you to perform a ReLU operation, with the same semantics as x if x >= 0 else 0:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.relu(x)

inputset = [np.random.randint(-10, 10) for _ in range(10)]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == 0
assert circuit.encrypt_run_decrypt(1) == 1
assert circuit.encrypt_run_decrypt(-1) == 0
assert circuit.encrypt_run_decrypt(-3) == 0
assert circuit.encrypt_run_decrypt(5) == 5

ReLU extension can be converted in two different ways:

- With a single TLU on the original bit-width.
- With multiple TLUs on smaller bit-widths.

For small bit-widths, the first one is better as it'll have a single TLU on a small bit-width. For big bit-widths, the second one is better as it won't have a TLU on a big bit-width.

The decision between the two can be controlled with relu_on_bits_threshold: int = 7 configuration option:

relu_on_bits_threshold=5 means:
- 1-bit to 4-bits would be converted using the first way (i.e., using TLU)
- 5-bits and more would be converted using the second way (i.e., using bits)

There is another option to customize the implementation relu_on_bits_chunk_size: int = 2:

relu_on_bits_chunk_size=4 means:
- When using the second implementation:
  - The input would be split to 4-bit chunks using fhe.bits, and then the ReLU would be applied to those chunks, which are then combined back.

Here is a script showing how execution cost is impacted when changing these values:

from concrete import fhe
import numpy as np
import matplotlib.pyplot as plt

chunk_sizes = np.array(range(1, 6), dtype=int)
bit_widths = np.array(range(5, 17), dtype=int)

data = []
for bit_width in bit_widths:
    title = f"{bit_width=}:"
    print(title)
    print("-" * len(title))

    inputset = range(-2**(bit_width-1), 2**(bit_width-1))
    configuration = fhe.Configuration(relu_on_bits_threshold=17)

    compiler = fhe.Compiler(lambda x: fhe.relu((fhe.relu(x) - (2**(bit_width-2))) * 2), {"x": "encrypted"})
    circuit = compiler.compile(inputset, configuration)

    print(f"    Complexity: {circuit.complexity} # tlu")
    data.append((bit_width, 0, circuit.complexity))

    for chunk_size in chunk_sizes:
        configuration = fhe.Configuration(
            relu_on_bits_threshold=1,
            relu_on_bits_chunk_size=int(chunk_size),
        )
        circuit = compiler.compile(inputset, configuration)

        print(f"    Complexity: {circuit.complexity} # {chunk_size=}")
        data.append((bit_width, chunk_size, circuit.complexity))

    print()

data = np.array(data)

plt.title(f"ReLU using TLU vs using bits")
plt.xlabel("Input/Output precision")
plt.ylabel("Cost")

for i, chunk_size in enumerate([0] + list(chunk_sizes)):
    costs = [
        cost
        for _, candidate_chunk_size, cost in data
        if candidate_chunk_size == chunk_size
    ]
    assert len(costs) == len(bit_widths)

    label = "Single TLU" if i == 0 else f"Bits extract + multiples {chunk_size + 1} bits TLUs"
    width_bar = 0.8 / (len(chunk_sizes) + 1)

    if i == 0:
        plt.hlines(
            costs,
            bit_widths - 0.45,
            bit_widths + 0.45,
            label=label,
            linestyle="--",
        )
    else:
        plt.bar(
            np.array(bit_widths) + width_bar * (i - (len(chunk_sizes) + 1) / 2),
            height=costs,
            width=width_bar,
            label=label,
        )

plt.xticks(bit_widths)
plt.legend(loc="upper left")

plt.show()

You might need to run the script twice to avoid crashing when plotting.

The script will show the following figure:

The default values of these options are set based on simple circuits. How they affect performance will depend on the circuit, so play around with them to get the most out of this extension.

Conversion with the second method (i.e., using chunks) only works in Native encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.

fhe.if_then_else(condition, x, y)

Allows you to perform a ternary if operation, with the same semantics as x if condition else y:

import numpy as np
from concrete import fhe

@fhe.compiler({"condition": "encrypted", "x": "encrypted", "y": "encrypted"})
def f(condition, x, y):
    return fhe.if_then_else(condition, x, y)

inputset = [
    (
        np.random.randint(0, 2**1),
        np.random.randint(0, 2**5),
        np.random.randint(-2**3, 2**3),
    )
    for _ in range(10)
]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(1, 3, 5) == 3
assert circuit.encrypt_run_decrypt(0, 3, 5) == 5
assert circuit.encrypt_run_decrypt(1, 3, -5) == 3
assert circuit.encrypt_run_decrypt(0, 3, -5) == -5

fhe.if_then_else is just an alias for np.where.

fhe.identity(value)

Allows you to copy the value:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.identity(x)

inputset = [np.random.randint(-10, 10) for _ in range(10)]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == 0
assert circuit.encrypt_run_decrypt(1) == 1
assert circuit.encrypt_run_decrypt(-1) == -1
assert circuit.encrypt_run_decrypt(-3) == -3
assert circuit.encrypt_run_decrypt(5) == 5

Identity extension can be used to clone an input while changing its bit-width. Imagine you have return x**2, x+100 where x is 2-bits. Because of x+100, x will be assigned 7-bits and x**2 would be more expensive than it needs to be. If return x**2, fhe.identity(x)+100 is used instead, x will be assigned 2-bits as it should and fhe.identity(x) will be assigned 7-bits as necessary.

Identity extension only works in Native encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.

fhe.inputset(...)

Used for creating a random inputset with the given specifications:

inputset = fhe.inputset(fhe.uint4, fhe.tensor[fhe.int3, 3, 2], lambda index: custom_value(index))
assert isinstance(inputset, list)
assert all(isinstance(sample, tuple) and len(sample) == 3 for sample in inputset)

The result will have 100 inputs by default which can be customized using the size keyword argument:

inputset = fhe.inputset(fhe.uint4, fhe.uint4, size=10)
assert len(inputset) == 10

Extensions

fhe.univariate(function)

Allows you to wrap any univariate function into a single table lookup:

import numpy as np
from concrete import fhe

def complex_univariate_function(x):

    def per_element(element):
        result = 0
        for i in range(element):
            result += i
        return result

    return np.vectorize(per_element)(x)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.univariate(complex_univariate_function)(x)

inputset = [np.random.randint(0, 5, size=(3, 2)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array([
    [0, 4],
    [2, 1],
    [3, 0],
])
assert np.array_equal(circuit.encrypt_run_decrypt(sample), complex_univariate_function(sample))

The wrapped function:

shouldn't have any side effects (e.g., no modification of global state)
should be deterministic (e.g., no random numbers)
should have the same output shape as its input (i.e., output.shape should be the same with input.shape)
each output element should correspond to a single input element (e.g., output[0] should only depend on input[0])

If any of these constraints are violated, the outcome is undefined.

fhe.multivariate(function)

Allows you to wrap any multivariate function into a table lookup:

import numpy as np
from concrete import fhe

def value_if_condition_else_zero(value, condition):
    return value if condition else np.zeros_like(value, dtype=np.int64)

def function(x, y):
    return fhe.multivariate(value_if_condition_else_zero)(x, y)

inputset = [
    (
        np.random.randint(-2**4, 2**4, size=(2, 2)),
        np.random.randint(0, 2**1, size=()),
    )
    for _ in range(100)
]

compiler = fhe.Compiler(function, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset)

sample = [np.array([[-2, 4], [0, 1]]), 0]
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), function(*sample))

sample = [np.array([[3, -1], [2, 4]]), 1]
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), function(*sample))

The wrapped function:

shouldn't have any side effects (e.g., no modification of global state)
should be deterministic (e.g., no random numbers)
should have input shapes which are broadcastable to the output shape (i.e., input.shape should be broadcastable to output.shape for all inputs)
each output element should correspond to a single input element (e.g., output[0] should only depend on input[0] of all inputs)

If any of these constraints are violated, the outcome is undefined.

Multivariate functions cannot be called with rounded inputs.

fhe.conv(...)

Allows you to perform a convolution operation, with the same semantic as onnx.Conv:

import numpy as np
from concrete import fhe

weight = np.array([[2, 1], [3, 2]]).reshape(1, 1, 2, 2)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.conv(x, weight, strides=(2, 2), dilations=(1, 1), group=1)

inputset = [np.random.randint(0, 4, size=(1, 1, 4, 4)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array(
    [
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
    ]
).reshape(1, 1, 4, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(sample), f(sample))

Only 2D convolutions without padding and with one group are currently supported.

fhe.maxpool(...)

Allows you to perform a maxpool operation, with the same semantic as onnx.MaxPool:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.maxpool(x, kernel_shape=(2, 2), strides=(2, 2), dilations=(1, 1))

inputset = [np.random.randint(0, 4, size=(1, 1, 4, 4)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array(
    [
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
    ]
).reshape(1, 1, 4, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(sample), f(sample))

Only 2D maxpooling without padding and up to 15-bits is currently supported.

fhe.array(...)

Allows you to create encrypted arrays:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    return fhe.array([x, y])

inputset = [(3, 2), (7, 0), (0, 7), (4, 2)]
circuit = f.compile(inputset)

sample = (3, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), f(*sample))

Currently, only scalars can be used to create arrays.

fhe.zero()

Allows you to create an encrypted scalar zero:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.zero()
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert circuit.encrypt_run_decrypt(x) == x

fhe.zeros(shape)

Allows you to create an encrypted tensor of zeros:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.zeros((2, 3))
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert np.array_equal(circuit.encrypt_run_decrypt(x), np.array([[x, x, x], [x, x, x]]))

fhe.one()

Allows you to create an encrypted scalar one:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.one()
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert circuit.encrypt_run_decrypt(x) == x + 1

fhe.ones(shape)

Allows you to create an encrypted tensor of ones:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.ones((2, 3))
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert np.array_equal(circuit.encrypt_run_decrypt(x), np.array([[x, x, x], [x, x, x]]) + 1)

fhe.hint(value, **kwargs)

Allows you to hint properties of a value. Imagine you have this circuit:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x, y, z):
    a = x | y
    b = y & z
    c = a ^ b
    return c

inputset = [
    (np.random.randint(0, 2**8), np.random.randint(0, 2**8), np.random.randint(0, 2**8))
    for _ in range(3)
]
circuit = f.compile(inputset)

print(circuit)

You'd expect all of a, b, and c to be 8-bits, but because inputset is very small, this code could print:

%0 = x                          # EncryptedScalar<uint8>        ∈ [173, 240]
%1 = y                          # EncryptedScalar<uint8>        ∈ [52, 219]
%2 = z                          # EncryptedScalar<uint8>        ∈ [36, 252]
%3 = bitwise_or(%0, %1)         # EncryptedScalar<uint8>        ∈ [243, 255]
%4 = bitwise_and(%1, %2)        # EncryptedScalar<uint7>        ∈ [0, 112] 
                                                  ^^^^^ this can lead to bugs
%5 = bitwise_xor(%3, %4)        # EncryptedScalar<uint8>        ∈ [131, 255]
return %5

Bit-width hints are for constraining the minimum number of bits in the encoded value. If you hint a value to be 8-bits, it means it should be at least uint8 or int8.

To fix f using hints, you can do:

@fhe.compiler({"x": "encrypted", "y": "encrypted", "z": "encrypted"})
def f(x, y, z):
    # hint that inputs should be considered at least 8-bits
    x = fhe.hint(x, bit_width=8)
    y = fhe.hint(y, bit_width=8)
    z = fhe.hint(z, bit_width=8)

    # hint that intermediates should be considered at least 8-bits
    a = fhe.hint(x | y, bit_width=8)
    b = fhe.hint(y & z, bit_width=8)
    c = fhe.hint(a ^ b, bit_width=8)

    return c

Hints are only applied to the value being hinted, and no other value. If you want the hint to be applied to multiple values, you need to hint all of them.

you'll always see:

%0 = x                          # EncryptedScalar<uint8>        ∈ [...]
%1 = y                          # EncryptedScalar<uint8>        ∈ [...]
%2 = z                          # EncryptedScalar<uint8>        ∈ [...]
%3 = bitwise_or(%0, %1)         # EncryptedScalar<uint8>        ∈ [...]
%4 = bitwise_and(%1, %2)        # EncryptedScalar<uint8>        ∈ [...] 
%5 = bitwise_xor(%3, %4)        # EncryptedScalar<uint8>        ∈ [...]
return %5

regardless of the bounds.

Alternatively, you can use it to make sure a value can store certain integers:

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def is_vectors_same(x, y):
    assert x.ndim != 1
    assert y.ndim != 1
    
    assert len(x) == len(y)
    n = len(x)
    
    number_of_same_elements = np.sum(x == y)
    fhe.hint(number_of_same_elements, can_store=n)  # hint that number of same elements can go up to n
    is_same = number_of_same_elements == n

    return is_same

fhe.relu(value)

Allows you to perform ReLU operation, with the same semantic as x if x >= 0 else 0:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.relu(x)

inputset = [np.random.randint(-10, 10) for _ in range(10)]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == 0
assert circuit.encrypt_run_decrypt(1) == 1
assert circuit.encrypt_run_decrypt(-1) == 0
assert circuit.encrypt_run_decrypt(-3) == 0
assert circuit.encrypt_run_decrypt(5) == 5

ReLU extension can be converted in two different ways:

With a single TLU on the original bit-width.
With multiple TLUs on smaller bit-widths.

For small bit-widths, the first one is better as it'll have a single TLU on a small bit-width. For big bit-widths, the second one is better as it won't have a TLU on a big bit-width.

The decision between the two can be controlled with relu_on_bits_threshold: int = 7 configuration option:

relu_on_bits_threshold=5 means:
- 1-bit to 4-bits would be converted using the first way (i.e., using TLU)
- 5-bits and more would be converted using the second way (i.e., using bits)

There is another option to customize the implementation relu_on_bits_chunk_size: int = 2:

relu_on_bits_chunk_size=4 means:
- When using the second implementation:
  - The input would be split to 4-bit chunks using fhe.bits, and then the ReLU would be applied to those chunks, which are then combined back.

Here is a script showing how execution cost is impacted when changing these values:

from concrete import fhe
import numpy as np
import matplotlib.pyplot as plt

chunk_sizes = np.array(range(1, 6), dtype=int)
bit_widths = np.array(range(5, 17), dtype=int)

data = []
for bit_width in bit_widths:
    title = f"{bit_width=}:"
    print(title)
    print("-" * len(title))

    inputset = range(-2**(bit_width-1), 2**(bit_width-1))
    configuration = fhe.Configuration(relu_on_bits_threshold=17)

    compiler = fhe.Compiler(lambda x: fhe.relu((fhe.relu(x) - (2**(bit_width-2))) * 2), {"x": "encrypted"})
    circuit = compiler.compile(inputset, configuration)

    print(f"    Complexity: {circuit.complexity} # tlu")
    data.append((bit_width, 0, circuit.complexity))

    for chunk_size in chunk_sizes:
        configuration = fhe.Configuration(
            relu_on_bits_threshold=1,
            relu_on_bits_chunk_size=int(chunk_size),
        )
        circuit = compiler.compile(inputset, configuration)

        print(f"    Complexity: {circuit.complexity} # {chunk_size=}")
        data.append((bit_width, chunk_size, circuit.complexity))

    print()

data = np.array(data)

plt.title(f"ReLU using TLU vs using bits")
plt.xlabel("Input/Output precision")
plt.ylabel("Cost")

for i, chunk_size in enumerate([0] + list(chunk_sizes)):
    costs = [
        cost
        for _, candidate_chunk_size, cost in data
        if candidate_chunk_size == chunk_size
    ]
    assert len(costs) == len(bit_widths)

    label = "Single TLU" if i == 0 else f"Bits extract + multiples {chunk_size + 1} bits TLUs"
    width_bar = 0.8 / (len(chunk_sizes) + 1)

    if i == 0:
        plt.hlines(
            costs,
            bit_widths - 0.45,
            bit_widths + 0.45,
            label=label,
            linestyle="--",
        )
    else:
        plt.bar(
            np.array(bit_widths) + width_bar * (i - (len(chunk_sizes) + 1) / 2),
            height=costs,
            width=width_bar,
            label=label,
        )

plt.xticks(bit_widths)
plt.legend(loc="upper left")

plt.show()

You might need to run the script twice to avoid crashing when plotting.

The script will show the following figure:

The default values of these options are set based on simple circuits. How they affect performance will depend on the circuit, so play around with them to get the most out of this extension.

Conversion with the second method (i.e., using chunks) only works in Native encoding, which is usually selected when all table lookups in the circuit are less than or equal to 8 bits.

fhe.if_then_else(condition, x, y)

Allows you to perform a ternary if operation, with the same semantics as x if condition else y:

import numpy as np
from concrete import fhe

@fhe.compiler({"condition": "encrypted", "x": "encrypted", "y": "encrypted"})
def f(condition, x, y):
    return fhe.if_then_else(condition, x, y)

inputset = [
    (
        np.random.randint(0, 2**1),
        np.random.randint(0, 2**5),
        np.random.randint(-2**3, 2**3),
    )
    for _ in range(10)
]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(1, 3, 5) == 3
assert circuit.encrypt_run_decrypt(0, 3, 5) == 5
assert circuit.encrypt_run_decrypt(1, 3, -5) == 3
assert circuit.encrypt_run_decrypt(0, 3, -5) == -5

fhe.if_then_else is just an alias for np.where.

fhe.identity(value)

Allows you to copy the value:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.identity(x)

inputset = [np.random.randint(-10, 10) for _ in range(10)]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == 0
assert circuit.encrypt_run_decrypt(1) == 1
assert circuit.encrypt_run_decrypt(-1) == -1
assert circuit.encrypt_run_decrypt(-3) == -3
assert circuit.encrypt_run_decrypt(5) == 5

The identity extension only works in Native encoding, which is usually selected when all table lookups in the circuit are less than or equal to 8 bits.

fhe.inputset(...)

Used for creating a random inputset with the given specifications:

inputset = fhe.inputset(fhe.uint4, fhe.tensor[fhe.int3, 3, 2], lambda index: custom_value(index))
assert isinstance(inputset, list)
assert all(isinstance(sample, tuple) and len(sample) == 3 for sample in inputset)

The result will have 100 inputs by default, which can be customized using the size keyword argument:

inputset = fhe.inputset(fhe.uint4, fhe.uint4, size=10)
assert len(inputset) == 10

Common tips

Minimum for Two values

In this first example, we compute the minimum by taking the difference between two numbers y and x and conditionally subtracting it from y, which yields x if y > x and y otherwise:
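
The identity behind this trick can be checked in plain Python, without any encryption. The sketch below is only a cleartext illustration (the name min_two is hypothetical). Inside a circuit, the built-in max would be written as np.maximum, as in the several-values example later in this section.

```python
def min_two(x: int, y: int) -> int:
    """Cleartext model: subtract the positive part of (y - x) from y."""
    diff = y - x
    # If y > x, diff is positive and y - diff == x.
    # Otherwise max(diff, 0) == 0 and the result is y.
    return y - max(diff, 0)


# Exhaustive check over the 4-bit range used by the other examples.
for x in range(16):
    for y in range(16):
        assert min_two(x, y) == min(x, y)

print(min_two(3, 11), min_two(11, 3), min_two(5, 5))
```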

Maximum for Two values

The companion example, computing the maximum of two integers instead of the minimum:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def max_two(x, y):
    diff = y - x
    max_x_y = y - np.minimum(diff, 0)
    return max_x_y

inputset = [tuple(np.random.randint(0, 16, size=2)) for _ in range(50)]
circuit = max_two.compile(inputset)

x, y = np.random.randint(0, 16, size=2)
assert circuit.encrypt_run_decrypt(x, y) == max(x, y)

Minimum for several values

And an extension for more than two values:

import numpy as np
from concrete import fhe

@fhe.compiler({"args": "encrypted"})
def fhe_min(args):
    remaining = list(args)
    while len(remaining) > 1:
        a = remaining.pop()
        b = remaining.pop()
        min_a_b = b - np.maximum(b - a, 0)
        remaining.insert(0, min_a_b)
    return remaining[0]

inputset = [np.random.randint(0, 16, size=5) for _ in range(50)]
circuit = fhe_min.compile(inputset)

x1, x2, x3, x4, x5 = np.random.randint(0, 16, size=5)
assert circuit.encrypt_run_decrypt([x1, x2, x3, x4, x5]) == min(x1, x2, x3, x4, x5)

Retrieving a value within an encrypted array with an encrypted index

import numpy as np
from concrete import fhe

@fhe.compiler({"array": "encrypted", "index": "encrypted"})
def indexed_value(array, index):
    all_indices = np.arange(array.size)
    index_selection = index == all_indices
    selection_and_zeros = array * index_selection
    selection = np.sum(selection_and_zeros)
    return selection

inputset = [(np.random.randint(0, 16, size=5), np.random.randint(0, 5)) for _ in range(50)]
circuit = indexed_value.compile(inputset)

array = np.random.randint(0, 16, size=5)

index = np.random.randint(0, 5)
assert circuit.encrypt_run_decrypt(array, index) == array[index]
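
To see why this works, here is a cleartext walk-through of the same steps using plain Python lists. It is a sketch for intuition only, not code meant to be compiled. The comparison builds a one-hot mask, the multiplication zeroes every element except the selected one, and the sum collapses the result to a single value.

```python
def indexed_value(array, index):
    """Cleartext model of the one-hot selection used in the circuit above."""
    all_indices = range(len(array))
    index_selection = [int(index == i) for i in all_indices]   # one-hot mask
    selection_and_zeros = [value * keep for value, keep in zip(array, index_selection)]
    return sum(selection_and_zeros)                            # only one non-zero term


array = [7, 3, 12, 0, 9]
print([int(2 == i) for i in range(len(array))])  # the mask for index 2
print(indexed_value(array, 2))                   # the selected element
```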

Filter an array with comparison (>)

import numpy as np
from concrete import fhe

@fhe.compiler({"numbers": "encrypted", "threshold": "encrypted"})
def filtering(numbers, threshold):
    is_greater = numbers > threshold

    shifted_numbers = numbers * 2  # open space for a single bit at the end
    combined_numbers_and_is_greater = shifted_numbers + is_greater  # put is_greater to that bit

    def extract(combination):
        is_greater = (combination % 2) == 1  # extract is_greater back from packing
        if_true = combination // 2  # if is greater is true, we unpack the number and use it
        if_false = 0  # otherwise we set the element to zero
        return np.where(is_greater, if_true, if_false)  # and apply the operation

    return fhe.univariate(extract)(combined_numbers_and_is_greater)

inputset = [(np.random.randint(0, 16, size=5), np.random.randint(0, 16)) for _ in range(50)]
circuit = filtering.compile(inputset)

numbers = np.random.randint(0, 16, size=5)
threshold = np.random.randint(0, 16)
assert np.array_equal(circuit.encrypt_run_decrypt(numbers, threshold), list(map(lambda x: x if x > threshold else 0, numbers)))
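
The packing trick in this example can be followed in cleartext with a few lines of plain Python. This is a sketch for intuition only, and the helper names pack and unpack_and_filter are hypothetical. The comparison result is stored in the lowest bit of the doubled number, so a single univariate function can later recover both pieces of information.

```python
def pack(number: int, is_greater: bool) -> int:
    """Shift the number left by one bit and store the flag in the freed bit."""
    return number * 2 + int(is_greater)


def unpack_and_filter(combination: int) -> int:
    """Recover the flag and the number, then keep the number only if the flag is set."""
    is_greater = (combination % 2) == 1
    return combination // 2 if is_greater else 0


def filtering(numbers, threshold):
    return [unpack_and_filter(pack(n, n > threshold)) for n in numbers]


print(pack(9, True), pack(3, False))
print(filtering([3, 9, 15, 0, 8], 8))
```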

Matrix Row/Col means

import numpy as np
from concrete import fhe

def smallest_prime_divisor(n):
    if n % 2 == 0:
        return 2

    for i in range(3, int(np.sqrt(n)) + 1):
        if n % i == 0:
            return i

    return n

def mean_of_vector(x):
    assert x.size != 0
    if x.size == 1:
        return x[0]

    group_size = smallest_prime_divisor(x.size)
    if x.size == group_size:
        return np.round(np.sum(x) / x.size).astype(np.int64)

    groups = []
    for i in range(x.size // group_size):
        start = i * group_size
        end = start + group_size
        groups.append(x[start:end])

    mean_of_groups = []
    for group in groups:
        mean_of_groups.append(np.round(np.sum(group) / group_size).astype(np.int64))

    return mean_of_vector(fhe.array(mean_of_groups))

@fhe.compiler(({"x": "encrypted"}))
def mean_of_matrix(x):
    return mean_of_vector(x.flatten())

@fhe.compiler(({"x": "encrypted"}))
def mean_of_rows_of_matrix(x):
    means = []
    for i in range(x.shape[0]):
        means.append(mean_of_vector(x[i]))
    return fhe.array(means)

@fhe.compiler(({"x": "encrypted"}))
def mean_of_columns_of_matrix(x):
    means = []
    for i in range(x.shape[1]):
        means.append(mean_of_vector(x[:, i]))
    return fhe.array(means)


inputset = [np.random.randint(0, 16, size=(5,5)) for _ in range(50)]
matrix = np.random.randint(0, 16, size=(5, 5))

circuit = mean_of_matrix.compile(inputset)
assert circuit.encrypt_run_decrypt(matrix) == round(matrix.mean())

circuit = mean_of_rows_of_matrix.compile(inputset)
assert np.array_equal(circuit.encrypt_run_decrypt(matrix), [round(x) for x in matrix.mean(1)])

circuit = mean_of_columns_of_matrix.compile(inputset)
assert np.array_equal(circuit.encrypt_run_decrypt(matrix), [round(x) for x in matrix.mean(0)])

Bit extraction

Some applications require directly manipulating bits of integers. Concrete provides a bit extraction operation for such applications.

Bit extraction only works in the Native encoding, which is usually selected when all table lookups in the circuit are less than or equal to 8 bits.

Slices can be used for indexing fhe.bits(value) as well.

Even slices with negative steps are supported!

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.bits(x)[3:0:-1]

inputset = range(32)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0b_01101) == 0b_011
assert circuit.encrypt_run_decrypt(0b_01011) == 0b_101

Signed integers are supported as well.

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.bits(x)[1:3]

inputset = range(-16, 16)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(-14) == 0b_01  # -14 == 0b_10010 (in two's complement)
assert circuit.encrypt_run_decrypt(-12) == 0b_10  # -12 == 0b_10100 (in two's complement)
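
The expected values in the two examples above can be cross-checked with a small cleartext model. The function below is a hypothetical helper, not part of Concrete. It assumes, based on the assertions above, that the first extracted index becomes the least significant bit of the result.

```python
def extract_bits(x: int, start: int, stop: int, step: int = 1) -> int:
    """Cleartext model of fhe.bits(x)[start:stop:step] with explicit bounds."""
    if start < 0 or stop < 0:
        raise ValueError("negative indices are not supported")
    result = 0
    for position, index in enumerate(range(start, stop, step)):
        # Python's >> is an arithmetic shift, so negative values
        # behave like two's complement integers here.
        bit = (x >> index) & 1
        result |= bit << position
    return result


print(bin(extract_bits(0b01101, 3, 0, -1)))  # bits 3, 2, 1 of 0b01101
print(bin(extract_bits(-14, 1, 3)))          # bits 1, 2 of -14
```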

Lastly, here is a practical use case of bit extraction.

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def is_even(x):
    return 1 - fhe.bits(x)[0]

inputset = [
    np.random.randint(-16, 16, size=(5,))
    for _ in range(100)
]
circuit = is_even.compile(inputset)

sample = np.random.randint(-16, 16, size=(5,))
for value, value_is_even in zip(sample, circuit.encrypt_run_decrypt(sample)):
    print(f"{value} is {'even' if value_is_even else 'odd'}")

prints

13 is odd
0 is even
-15 is odd
2 is even
-6 is even

Limitations

Bits cannot be extracted using a negative index.
- Which means fhe.bits(x)[-1] or fhe.bits(x)[-4:-1] is not supported for example.
- The reason for this is that we don't know in advance (i.e., before inputset evaluation) how many bits x has.
  - For example, let's say you have x == 10 == 0b_000...0001010, and you want to do fhe.bits(x)[-1]. If the value is 4-bits (i.e., 0b_1010), the result needs to be 1, but if it's 6-bits (i.e., 0b_001010), the result needs to be 0. Since we don't know the bit-width of x before inputset evaluation, we cannot calculate fhe.bits(x)[-1].
When extracting bits using slices in reverse order (i.e., step < 0), the start bit needs to be provided explicitly.
- Which means fhe.bits(x)[::-1] or fhe.bits(x)[:2:-1] is not supported for example.
- The reason is the same as above.
When extracting bits of signed values using slices, the stop bit needs to be provided explicitly.
- Which means fhe.bits(x)[1:] or fhe.bits(x)[1::2] is not supported for example.
- The reason is similar to above.
  - To explain a bit more, signed integers use two's complement representation. In this representation, negative values have their most significant bits set to 1 (e.g., -1 == 0b_11111, -2 == 0b_11110, -3 == 0b_11101). Extracting bits always returns a non-negative value (e.g., fhe.bits(-1)[1:3] == 0b_11 == 3). This means that if you were to do fhe.bits(x)[1:] where x == -1, the result would be 0b_111 == 7 if x is 4 bits, but 0b_1111 == 15 if x is 5 bits. Since we don't know the bit-width of x before inputset evaluation, we cannot calculate fhe.bits(x)[1:].
Bits of floats cannot be extracted.
- Floats are partially supported but extracting their bits is not supported at all.
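
The bit-width ambiguity behind the first three limitations can be reproduced in cleartext. The sketch below uses hypothetical helpers (bits_at_width, pack_bits) and fixes the bit-width explicitly, which is exactly the information that is not available before inputset evaluation.

```python
def bits_at_width(x: int, bit_width: int) -> list:
    """Bits of x in two's complement at a given width, index 0 first."""
    unsigned = x & ((1 << bit_width) - 1)
    return [(unsigned >> i) & 1 for i in range(bit_width)]


def pack_bits(bits) -> int:
    """Combine bits back into an integer, first bit as least significant."""
    return sum(bit << position for position, bit in enumerate(bits))


# Negative index: the "last" bit of 10 depends on the assumed width.
print(bits_at_width(10, 4)[-1], bits_at_width(10, 6)[-1])

# Open-ended slice on a signed value: the result depends on the assumed width.
print(pack_bits(bits_at_width(-1, 4)[1:]), pack_bits(bits_at_width(-1, 5)[1:]))
```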

Performance Considerations

A Chain of Individual Bit Extractions

Key Concept: Extracting a specific bit requires clearing all the preceding lower bits. This involves extracting these previous bits as intermediate values and then subtracting them from the input.

Implications:

Bits are extracted sequentially, starting from the least significant bit to the more significant ones. The cost is proportional to the index of the highest extracted bit plus one.
No parallelization is possible. The computation time is proportional to the cost, independent of the number of CPUs.

Examples:

Extracting fhe.bits(x)[4] is approximately five times costlier than extracting fhe.bits(x)[0].
Extracting fhe.bits(x)[4] takes around five times more wall clock time than fhe.bits(x)[0].
The cost of extracting fhe.bits(x)[0:5] is almost the same as that of fhe.bits(x)[5].
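
The sequential nature of the extraction can be sketched in cleartext. The function below is a hypothetical model, not Concrete's implementation. It reads one bit at a time from the least significant side, subtracts each lower bit from the input before moving on, and counts the steps as a stand-in for cost.

```python
def extract_bit_sequentially(x: int, index: int):
    """Return (bit, steps) where steps counts the single-bit extractions performed."""
    if index < 0:
        raise ValueError("negative indices are not supported")
    remaining = x
    for i in range(index + 1):
        bit = (remaining >> i) & 1   # lowest bit that has not been cleared yet
        if i == index:
            return bit, i + 1        # index + 1 extractions in total
        remaining -= bit << i        # clear the lower bit before the next step


print(extract_bit_sequentially(0b01101, 0))
print(extract_bit_sequentially(0b01101, 4))
```

Each step depends on the result of the previous one, which is why the model has no opportunity for parallel work.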

Reuse of Intermediate Extracted Bits

Key Concept: Common sub-expression elimination is applied to intermediate extracted bits.

Implications:

The overall cost for a series of fhe.bits(x)[m:n] calls on the same input x is almost equivalent to the cost of the single most computationally expensive extraction in the series, i.e. fhe.bits(x)[n].
The order of extraction in that series does not affect the overall cost.

Example:

The combined operation fhe.bits(x)[3] + fhe.bits(x)[2] + fhe.bits(x)[1] has almost the same cost as fhe.bits(x)[3].
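
A cleartext model with a shared cache makes this reuse visible. The class below is hypothetical and only mimics the effect of common sub-expression elimination: bits that were already extracted for one request are not recomputed for another, so the cost counter only grows up to the highest requested index plus one.

```python
class BitExtractor:
    """Cleartext model of bit extraction that shares intermediate bits."""

    def __init__(self, x: int):
        self.remaining = x
        self.extracted = []  # extracted[i] holds bit i once it has been computed

    def bit(self, index: int) -> int:
        # Extract only the bits that are still missing, lowest first.
        while len(self.extracted) <= index:
            i = len(self.extracted)
            value = (self.remaining >> i) & 1
            self.remaining -= value << i   # clear the bit for the next step
            self.extracted.append(value)
        return self.extracted[index]

    @property
    def cost(self) -> int:
        """Number of single-bit extractions performed so far."""
        return len(self.extracted)


extractor = BitExtractor(0b1101)
total = extractor.bit(3) + extractor.bit(2) + extractor.bit(1)
print(total, extractor.cost)  # cost matches a single extraction of bit 3
```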

TLUs of 1b input precision

Each extracted bit incurs a cost of approximately one TLU of 1-bit input precision. Therefore, fhe.bits(x)[0] is generally faster than any other TLU operation.