Terminology

This document provides clear definitions of key concepts used in the Concrete framework.

Computation graph

A data structure to represent a computation. It takes the form of a directed acyclic graph where nodes represent inputs, constants, or operations.

Tracing

A method that takes a Python function provided by the user and generates a corresponding computation graph.

Bounds

The minimum and the maximum value that each node in the computation graph can take. Bounds are used to determine the appropriate data type (for example, uint3 or int5) for each node before the computation graphs are converted to MLIR. Concrete simulates the graph with the inputs in the inputset to record the minimum and the maximum value for each node.
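The bounds measurement described above can be pictured in cleartext. This is a minimal sketch, not Concrete's implementation: `measure_bounds` and `smallest_dtype` are hypothetical helpers, and only the function output is tracked rather than every node of the graph.

```python
def measure_bounds(function, inputset):
    """Evaluate `function` on every sample and return (min, max) of its outputs."""
    outputs = [function(*sample) for sample in inputset]
    return min(outputs), max(outputs)


def smallest_dtype(lower, upper):
    """Name the narrowest integer type that holds every value in [lower, upper]."""
    if lower >= 0:
        # unsigned: enough bits for the largest value (at least 1 bit)
        return f"uint{max(1, upper.bit_length())}"
    bits = 1
    # signed: grow until [-2**(bits-1), 2**(bits-1) - 1] covers the bounds
    while not (-(2 ** (bits - 1)) <= lower and upper <= 2 ** (bits - 1) - 1):
        bits += 1
    return f"int{bits}"


# hypothetical inputset: every pair of 2-bit values
inputset = [(x, y) for x in range(4) for y in range(4)]
bounds = measure_bounds(lambda x, y: x + y, inputset)
dtype = smallest_dtype(*bounds)
```

With this inputset, the sum ranges over [0, 6], so three unsigned bits are enough. A smaller inputset could yield narrower bounds and therefore a narrower type.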

Circuit

The result of compilation. A circuit includes both client and server components. It has methods for various operations, such as printing and evaluation.

Table Lookup (TLU)

TLU stands for instructions in the form of y = T[i]. In FHE, this operation is performed with Programmable Bootstrapping, which is the equivalent operation on encrypted values. To learn more about TLU, refer to the Table Lookup basic and the Table Lookup advanced sections.

Programmable Bootstrapping (PBS)

PBS is equivalent to table lookup y = T[i] on encrypted values, which means that the inputs i and the outputs y are encrypted, but the table T is not encrypted. You can find a more detailed explanation in the .

TFHE

TFHE is a Fully Homomorphic Encryption (FHE) scheme that allows you to perform computations over encrypted data. For an in-depth explanation of the TFHE scheme, read our blog post series TFHE Deep Dive.

Welcome

Concrete is an open-source FHE Compiler that simplifies the use of Fully Homomorphic Encryption (FHE).

Get started

Learn the basics of Concrete, set it up, and make it run with ease.

Build with Concrete

Start building with Concrete by exploring its core features, discovering essential guides, and learning more with step-by-step tutorials.

Explore more

Access additional resources and join the Zama community.

Explanations

Refer to the API, review product architecture, and access additional resources for in-depth explanations while working with Concrete.

Support channels

Ask technical questions and discuss with the community. Our team of experts usually answers within 24 hours on working days.

Developers

Collaborate with us to advance the FHE space and drive innovation together.



What is Concrete?

Concrete is an open-source framework that simplifies the use of Fully Homomorphic Encryption (FHE).

FHE is a powerful technology that enables computations on encrypted data without needing to decrypt it. This capability ensures user privacy and provides robust protection against data breaches, as operations are performed on encrypted data, keeping sensitive information secure even if the server is compromised.

The Concrete framework makes writing FHE programs easy for developers by incorporating a Fully Homomorphic Encryption over the Torus (TFHE) compiler based on LLVM.

Concrete enables developers to efficiently develop privacy-preserving applications for various use cases. For instance, Concrete ML is built on top of Concrete to integrate privacy-preserving features of FHE into machine learning use cases.

Quick overview

This document gives a quick overview of the philosophy behind Concrete.

Functions

Available FHE-friendly functions

Concrete is a compiler that turns Python code into its FHE equivalent, in a process called FHE compilation. We have made our best effort to simplify the process: apart from some exceptions, the functions that Python users are used to are available. A more complete list of available functions is given in the reference section.

Levelled vs non-levelled operations

In the compiled circuit, there are two kinds of operations:

  • levelled operations, which are additions, subtractions, or multiplications by a constant; these are also called linear operations

  • Table Lookup (TLU) operations, which are used for anything that is not linear.

TLUs are more costly than levelled operations, so we also explain how to limit their impact.

Note that matrix multiplication (also known as Gemm, General Matrix multiplication) and convolutions are levelled operations, since they involve only additions and multiplications by constants.

TLU operations are essential for compiling complex functions. We explain their use in different sections of the documentation: direct TLU use or internal use to replace some non-linear functions. Concrete has tools to replace univariate or multivariate non-linear functions (i.e., functions of one or more inputs) by TLUs.

Conditional branches and loops

Functions can't use conditional branches or non-constant-size loops, unless modules are used. However, control flow statements with constant values are allowed, for example, for i in range(SOME_CONSTANT), if os.environ.get("SOME_FEATURE") == "ON":.

Data

Integers

In Concrete, everything needs to be an integer. Users who need floats can quantize to integers before encryption, operate on integers, and dequantize to floats after decryption; Concrete ML does all of this for the user. However, you can have floating-point intermediate values as long as they can be converted to an integer Table Lookup, for example, (60 * np.sin(x)).astype(np.int64).

Scalars and tensors

Functions can use scalars and tensors. As in Python, tensorized code is preferred because it makes computations faster.

Inputs

Inputs of a compiled function can be either encrypted or clear. However, the use of clear inputs is quite limited. Note that constants can appear in the program without much constraint; they differ from clear inputs, which are dynamic.

Bit width constraints

The bit width of encrypted values has a limit. We are constantly working on increasing this limit. Exceeding it triggers an error.
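To illustrate the remark in the Integers paragraph that floating-point intermediates are acceptable when the whole expression reduces to an integer table lookup, here is a cleartext sketch. It uses plain Python instead of Concrete, and the 4-bit input range is an assumption made for the example.

```python
import math

INPUT_BITS = 4  # assumed precision of the encrypted input


def f(x):
    # floating-point intermediate, truncated back to an integer
    # (int() truncates toward zero, like .astype(np.int64))
    return int(60 * math.sin(x))


# integer in, integer out: the function fits in one table with 2**4 entries
table = [f(x) for x in range(2 ** INPUT_BITS)]


def lookup(x):
    """Evaluate f by indexing the precomputed table."""
    return table[x]
```

The float values never leave the table construction step, which is why such an expression can be replaced by a single TLU.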

Non-linear operations

Overview of non-linear operations

In Concrete, there are two types of operations:

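  • Linear operations: These include additions, subtractions, and multiplications by an integer. They are computationally fast.

  • Non-linear operations: These include all other operations. They are implemented with Table Lookups (TLUs), which are more costly than linear operations.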

Changing bit width in the MLIR or dynamically with a TLU

Binary operations often require operands to have matching bit widths. This adjustment can be achieved in two ways: either directly within the MLIR or dynamically at execution time using a TLU. Each method has its own advantages and trade-offs, so Concrete provides multiple configuration options for non-linear functions.

MLIR adjustment: This method doesn't require an expensive TLU. However, it may affect other parts of your program if the adjusted operand is used elsewhere, potentially causing more changes.

Dynamic adjustment with TLU: This method is more localized and won’t impact other parts of your program, but it’s more expensive due to the cost of using a TLU.

General guidelines

In the following non-linear operations, we propose a certain number of configurations, using the two methods on the different operands. It’s not always clear which option will be the fastest, so we recommend trying out different configurations to see what works best for your circuit.

Note that you can set show_mlir=True to view how the MLIR handles TLUs and bit width changes. However, it's not essential to understand these details, so we recommend simply testing the configurations and picking the one that performs best for your case.

Comparisons

For comparison, there are 7 available methods. Here's the general principle:

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    comparison_strategy_preference=config,
)

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

The config can be one of the following:

  • fhe.ComparisonStrategy.CHUNKED

  • fhe.ComparisonStrategy.ONE_TLU_PROMOTED

  • fhe.ComparisonStrategy.THREE_TLU_CASTED

  • fhe.ComparisonStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED

  • fhe.ComparisonStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED

  • fhe.ComparisonStrategy.THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED

  • fhe.ComparisonStrategy.TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED

Min / Max operations

For min / max operations, there are 3 available methods. Here's the general principle:

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    min_max_strategy_preference=config,
)

def f(x, y):
    return np.minimum(x, y)

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**2))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

The config can be one of the following:

  • fhe.MinMaxStrategy.CHUNKED (default)

  • fhe.MinMaxStrategy.ONE_TLU_PROMOTED

  • fhe.MinMaxStrategy.THREE_TLU_CASTED

Bitwise operations

For bitwise operations (typically, AND, OR, XOR), there are 5 available methods. Here's the general principle:

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    bitwise_strategy_preference=config,
)

def f(x, y):
    return x & y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

The config can be one of the following:

  • fhe.BitwiseStrategy.CHUNKED

  • fhe.BitwiseStrategy.ONE_TLU_PROMOTED

  • fhe.BitwiseStrategy.THREE_TLU_CASTED

  • fhe.BitwiseStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED

  • fhe.BitwiseStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED

Shift operations

For shift operations, there are 2 available methods. Here's the general principle:

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    shifts_with_promotion=shifts_with_promotion,
)

def f(x, y):
    return x << y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**2))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

The shifts_with_promotion is either True or False.

Relation with fhe.multivariate

import numpy as np
from concrete import fhe


def f(x, y):
    return fhe.multivariate(lambda x, y: x << y)(x, y)


inputset = [(np.random.randint(0, 2**3), np.random.randint(0, 2**2)) for _ in range(100)]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, show_mlir=True)

Extensions

This document introduces some extensions of Concrete, including functions for wrapping univariate and multivariate functions, performing convolution and maxpool operations, creating encrypted arrays, and more.

fhe.univariate(function)

Wraps any univariate function into a single table lookup:

import numpy as np
from concrete import fhe

def complex_univariate_function(x):

    def per_element(element):
        result = 0
        for i in range(element):
            result += i
        return result

    return np.vectorize(per_element)(x)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.univariate(complex_univariate_function)(x)

inputset = [np.random.randint(0, 5, size=(3, 2)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array([
    [0, 4],
    [2, 1],
    [3, 0],
])
assert np.array_equal(circuit.encrypt_run_decrypt(sample), complex_univariate_function(sample))

The wrapped function must follow these criteria:

  • No side effects: For example, no modification of global state

  • Deterministic: For example, no random number generation.

  • Shape consistency: output.shape should be the same as input.shape.

  • Element-wise mapping: Each output element must correspond to a single input element, for example, output[0] should only depend on input[0].

Violating these constraints may result in undefined outcome.
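One way to see why these criteria matter is a cleartext model in which the wrapped function is evaluated once per possible input value to fill a table, and the table is then applied element by element. The sketch below is plain Python, not Concrete's implementation; `build_table` is a hypothetical helper and the input range 0 to 4 is taken from the inputset above.

```python
def per_element(element):
    # same per-element computation as in the example above
    result = 0
    for i in range(element):
        result += i
    return result


def build_table(function, lower, upper):
    """Evaluate `function` once for every value in [lower, upper]."""
    return {value: function(value) for value in range(lower, upper + 1)}


table = build_table(per_element, 0, 4)

sample = [[0, 4], [2, 1], [3, 0]]
# each output element depends only on the matching input element
result = [[table[value] for value in row] for row in sample]
```

A function with side effects or randomness would give a table that differs between evaluations, so the encrypted result could not be relied upon.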

fhe.multivariate(function)

Wraps any multivariate function into a table lookup:

import numpy as np
from concrete import fhe

def value_if_condition_else_zero(value, condition):
    return value if condition else np.zeros_like(value, dtype=np.int64)

def function(x, y):
    return fhe.multivariate(value_if_condition_else_zero)(x, y)

inputset = [
    (
        np.random.randint(-2**4, 2**4, size=(2, 2)),
        np.random.randint(0, 2**1, size=()),
    )
    for _ in range(100)
]

compiler = fhe.Compiler(function, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset)

sample = [np.array([[-2, 4], [0, 1]]), 0]
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), function(*sample))

sample = [np.array([[3, -1], [2, 4]]), 1]
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), function(*sample))

The wrapped functions must follow these criteria:

  • No side effects: For example, avoid modifying global state.

  • Deterministic: For example, no random number generation.

  • Broadcastable shapes: input.shape should be broadcastable to output.shape for all inputs.

  • Element-wise mapping: Each output element must correspond to a single input element, for example, output[0] should only depend on input[0] of all inputs.

Violating these constraints may result in undefined outcome.

fhe.conv(...)

import numpy as np
from concrete import fhe

weight = np.array([[2, 1], [3, 2]]).reshape(1, 1, 2, 2)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.conv(x, weight, strides=(2, 2), dilations=(1, 1), group=1)

inputset = [np.random.randint(0, 4, size=(1, 1, 4, 4)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array(
    [
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
    ]
).reshape(1, 1, 4, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(sample), f(sample))

Only 2D convolutions without padding and with one group are currently supported.

fhe.maxpool(...)

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.maxpool(x, kernel_shape=(2, 2), strides=(2, 2), dilations=(1, 1))

inputset = [np.random.randint(0, 4, size=(1, 1, 4, 4)) for _ in range(10)]
circuit = f.compile(inputset)

sample = np.array(
    [
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
        [3, 2, 1, 0],
    ]
).reshape(1, 1, 4, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(sample), f(sample))

Only 2D maxpooling without padding and with up to 15 bits is currently supported.

fhe.array(...)

Create encrypted arrays:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    return fhe.array([x, y])

inputset = [(3, 2), (7, 0), (0, 7), (4, 2)]
circuit = f.compile(inputset)

sample = (3, 4)
assert np.array_equal(circuit.encrypt_run_decrypt(*sample), f(*sample))

Currently, only scalars can be used to create arrays.

fhe.zero()

Create an encrypted scalar zero:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.zero()
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert circuit.encrypt_run_decrypt(x) == x

fhe.zeros(shape)

Create an encrypted tensor of zeros:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.zeros((2, 3))
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert np.array_equal(circuit.encrypt_run_decrypt(x), np.array([[x, x, x], [x, x, x]]))

fhe.one()

Create an encrypted scalar one:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.one()
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert circuit.encrypt_run_decrypt(x) == x + 1

fhe.ones(shape)

Create an encrypted tensor of ones:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    z = fhe.ones((2, 3))
    return x + z

inputset = range(10)
circuit = f.compile(inputset)

for x in range(10):
    assert np.array_equal(circuit.encrypt_run_decrypt(x), np.array([[x, x, x], [x, x, x]]) + 1)

fhe.constant(value)

Allows you to create an encrypted constant of a given value.

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted", "a":"clear"})
def f(x, a):
    z = fhe.constant(a)
    return x + z

# f takes two arguments, so each sample is an (x, a) tuple
inputset = [(x, 5) for x in range(10)]
circuit = f.compile(inputset)

for x in range(10):
    assert circuit.encrypt_run_decrypt(x, 5) == x + 5

This extension is also compatible with constant arrays.

fhe.hint(value, **kwargs)

Hint properties of a value. Imagine you have this circuit:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted", "y": "encrypted", "z": "encrypted"})
def f(x, y, z):
    a = x | y
    b = y & z
    c = a ^ b
    return c

inputset = [
    (np.random.randint(0, 2**8), np.random.randint(0, 2**8), np.random.randint(0, 2**8))
    for _ in range(3)
]
circuit = f.compile(inputset)

print(circuit)

You'd expect all of a, b, and c to be 8-bits, but because inputset is very small, this code could print:

%0 = x                          # EncryptedScalar<uint8>        ∈ [173, 240]
%1 = y                          # EncryptedScalar<uint8>        ∈ [52, 219]
%2 = z                          # EncryptedScalar<uint8>        ∈ [36, 252]
%3 = bitwise_or(%0, %1)         # EncryptedScalar<uint8>        ∈ [243, 255]
%4 = bitwise_and(%1, %2)        # EncryptedScalar<uint7>        ∈ [0, 112] 
                                                  ^^^^^ this can lead to bugs
%5 = bitwise_xor(%3, %4)        # EncryptedScalar<uint8>        ∈ [131, 255]
return %5

The first solution in these cases should be to use a bigger inputset, but it can still be tricky to solve with the inputset alone. That's where the hint extension comes into play. Hints are a way to provide extra information to the compilation process:

  • Bit-width hints are for constraining the minimum number of bits in the encoded value. If you hint a value to be 8-bits, it means it should be at least uint8 or int8.

To fix f using hints, you can do:

@fhe.compiler({"x": "encrypted", "y": "encrypted", "z": "encrypted"})
def f(x, y, z):
    # hint that inputs should be considered at least 8-bits
    x = fhe.hint(x, bit_width=8)
    y = fhe.hint(y, bit_width=8)
    z = fhe.hint(z, bit_width=8)

    # hint that intermediates should be considered at least 8-bits
    a = fhe.hint(x | y, bit_width=8)
    b = fhe.hint(y & z, bit_width=8)
    c = fhe.hint(a ^ b, bit_width=8)

    return c

Hints are only applied to the value being hinted, and no other value. If you want the hint to be applied to multiple values, you need to hint all of them.

With these hints, you'll always see:

%0 = x                          # EncryptedScalar<uint8>        ∈ [...]
%1 = y                          # EncryptedScalar<uint8>        ∈ [...]
%2 = z                          # EncryptedScalar<uint8>        ∈ [...]
%3 = bitwise_or(%0, %1)         # EncryptedScalar<uint8>        ∈ [...]
%4 = bitwise_and(%1, %2)        # EncryptedScalar<uint8>        ∈ [...] 
%5 = bitwise_xor(%3, %4)        # EncryptedScalar<uint8>        ∈ [...]
return %5

regardless of the bounds.

Alternatively, you can use it to make sure a value can store certain integers:

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def is_vectors_same(x, y):
    # both inputs are expected to be vectors (one-dimensional)
    assert x.ndim == 1
    assert y.ndim == 1
    
    assert len(x) == len(y)
    n = len(x)
    
    number_of_same_elements = np.sum(x == y)
    fhe.hint(number_of_same_elements, can_store=n)  # hint that number of same elements can go up to n
    is_same = number_of_same_elements == n

    return is_same
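The effect of both kinds of hints can be pictured as a lower limit on the bit width inferred from the bounds. The following cleartext sketch assumes unsigned values only; `inferred_bit_width` and `hinted_bit_width` are hypothetical helpers whose keyword names simply mirror the `bit_width` and `can_store` arguments shown above.

```python
def inferred_bit_width(upper):
    """Bits needed for unsigned values in [0, upper]."""
    return max(1, upper.bit_length())


def hinted_bit_width(upper, bit_width=None, can_store=None):
    """Width after applying optional hints; hints can only widen the value."""
    width = inferred_bit_width(upper)
    if bit_width is not None:
        # bit-width hint: at least this many bits
        width = max(width, bit_width)
    if can_store is not None:
        # can-store hint: enough bits to hold this integer
        width = max(width, inferred_bit_width(can_store))
    return width
```

For instance, bounds of [0, 112] alone give 7 bits, while the same bounds with an 8-bit hint give 8 bits, which matches the uint7 versus uint8 difference in the listings above.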

fhe.relu(value)

Perform the ReLU operation, with the same semantics as x if x >= 0 else 0:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.relu(x)

inputset = [np.random.randint(-10, 10) for _ in range(10)]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == 0
assert circuit.encrypt_run_decrypt(1) == 1
assert circuit.encrypt_run_decrypt(-1) == 0
assert circuit.encrypt_run_decrypt(-3) == 0
assert circuit.encrypt_run_decrypt(5) == 5

ReLU Conversion methods

The ReLU operation can be implemented in two ways:

  • Single TLU (Table Lookup) on the original bit-width: Suitable for small bit-widths, as it requires fewer resources.

  • Multiple TLUs on smaller bit-widths: Better for large bit-widths, avoiding the high cost of a single large TLU.

Configuration options

The method of conversion is controlled by the relu_on_bits_threshold: int = 7 option. For example, setting relu_on_bits_threshold=5 means:

  • Bit-widths from 1 to 4 will use a single TLU.

  • Bit-widths of 5 and above will use multiple TLUs.
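The threshold rule above can be written as a small decision function. This is only a cleartext restatement of the rule as described here; `relu_conversion` is a hypothetical helper, not part of Concrete.

```python
def relu(x):
    """Cleartext ReLU, the semantics that both conversion methods preserve."""
    return x if x >= 0 else 0


def relu_conversion(bit_width, relu_on_bits_threshold=7):
    """Pick the conversion method for a given bit-width, per the rule above."""
    if bit_width < relu_on_bits_threshold:
        return "single TLU"
    return "multiple TLUs"


# with relu_on_bits_threshold=5: widths 1 to 4 use one TLU, 5 and above use several
choices = {width: relu_conversion(width, 5) for width in range(1, 9)}
```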

Here is a script showing how execution cost is impacted when changing these values:

from concrete import fhe
import numpy as np
import matplotlib.pyplot as plt

chunk_sizes = np.array(range(1, 6), dtype=int)
bit_widths = np.array(range(5, 17), dtype=int)

data = []
for bit_width in bit_widths:
    title = f"{bit_width=}:"
    print(title)
    print("-" * len(title))

    inputset = range(-2**(bit_width-1), 2**(bit_width-1))
    configuration = fhe.Configuration(relu_on_bits_threshold=17)

    compiler = fhe.Compiler(lambda x: fhe.relu((fhe.relu(x) - (2**(bit_width-2))) * 2), {"x": "encrypted"})
    circuit = compiler.compile(inputset, configuration)

    print(f"    Complexity: {circuit.complexity} # tlu")
    data.append((bit_width, 0, circuit.complexity))

    for chunk_size in chunk_sizes:
        configuration = fhe.Configuration(
            relu_on_bits_threshold=1,
            relu_on_bits_chunk_size=int(chunk_size),
        )
        circuit = compiler.compile(inputset, configuration)

        print(f"    Complexity: {circuit.complexity} # {chunk_size=}")
        data.append((bit_width, chunk_size, circuit.complexity))

    print()

data = np.array(data)

plt.title(f"ReLU using TLU vs using bits")
plt.xlabel("Input/Output precision")
plt.ylabel("Cost")

for i, chunk_size in enumerate([0] + list(chunk_sizes)):
    costs = [
        cost
        for _, candidate_chunk_size, cost in data
        if candidate_chunk_size == chunk_size
    ]
    assert len(costs) == len(bit_widths)

    label = "Single TLU" if i == 0 else f"Bits extract + multiples {chunk_size + 1} bits TLUs"
    width_bar = 0.8 / (len(chunk_sizes) + 1)

    if i == 0:
        plt.hlines(
            costs,
            bit_widths - 0.45,
            bit_widths + 0.45,
            label=label,
            linestyle="--",
        )
    else:
        plt.bar(
            np.array(bit_widths) + width_bar * (i - (len(chunk_sizes) + 1) / 2),
            height=costs,
            width=width_bar,
            label=label,
        )

plt.xticks(bit_widths)
plt.legend(loc="upper left")

plt.show()

You might need to run the script twice to avoid crashing when plotting.

The script will show the following figure:

The default values of these options are set based on simple circuits. How they affect performance will depend on the circuit, so play around with them to get the most out of this extension.

Conversion with the second method (using chunks) only works in Native encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.

fhe.if_then_else(condition, x, y)

Perform a ternary if operation, with the same semantics as x if condition else y:

import numpy as np
from concrete import fhe

@fhe.compiler({"condition": "encrypted", "x": "encrypted", "y": "encrypted"})
def f(condition, x, y):
    return fhe.if_then_else(condition, x, y)

inputset = [
    (
        np.random.randint(0, 2**1),
        np.random.randint(0, 2**5),
        np.random.randint(-2**3, 2**3),
    )
    for _ in range(10)
]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(1, 3, 5) == 3
assert circuit.encrypt_run_decrypt(0, 3, 5) == 5
assert circuit.encrypt_run_decrypt(1, 3, -5) == 3
assert circuit.encrypt_run_decrypt(0, 3, -5) == -5

fhe.identity(value)

Copy the value:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.identity(x)

inputset = [np.random.randint(-10, 10) for _ in range(10)]
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == 0
assert circuit.encrypt_run_decrypt(1) == 1
assert circuit.encrypt_run_decrypt(-1) == -1
assert circuit.encrypt_run_decrypt(-3) == -3
assert circuit.encrypt_run_decrypt(5) == 5

The fhe.identity extension is useful for cloning an input with a different bit-width.

Identity extension only works in Native encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.

fhe.refresh(value)

It is similar to fhe.identity but with the extra guarantee that encryption noise is refreshed.

Refresh is useful when you want to control precisely where encryption noise is refreshed in your circuit. For instance, if you are using modules, compilation sometimes rejects the module because it's not composable. This happens because a function of the module never refreshes the encryption noise. Adding a return fhe.refresh(result) on the function result solves the issue.

Refresh extension only works in Native encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.

fhe.inputset(...)

Create a random inputset with the given specifications:

inputset = fhe.inputset(fhe.uint4, fhe.tensor[fhe.int3, 3, 2], lambda index: custom_value(index))
assert isinstance(inputset, list)
assert all(isinstance(sample, tuple) and len(sample) == 3 for sample in inputset)

The result will have 100 inputs by default which can be customized using the size keyword argument:

inputset = fhe.inputset(fhe.uint4, fhe.uint4, size=10)
assert len(inputset) == 10

Common tips

Retrieving a value within an encrypted array with an encrypted index

This example demonstrates how to retrieve a value from an array using an encrypted index. The method creates a "selection" array filled with 0s except for the requested index, which will be 1. It then sums the products of all array values with this selection array:
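The selection-array method can be sketched in cleartext as follows. This is plain Python rather than Concrete code, and `retrieve` is a hypothetical name; in FHE, both the comparison that builds the selection array and the products would operate on encrypted values.

```python
def retrieve(array, index):
    """Return array[index] without indexing directly by `index`."""
    # 1 at the requested position, 0 everywhere else
    selection = [int(position == index) for position in range(len(array))]
    # only the selected element survives the sum of products
    return sum(value * bit for value, bit in zip(array, selection))


values = [10, 20, 30, 40]
picked = retrieve(values, 2)
```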

Filter an array with comparison (>)

This example filters an encrypted array with an encrypted condition, in this case a greater than comparison with an encrypted value. It packs all values with a selection bit that results from the comparison, allowing the unpacking of only the filtered values:
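A cleartext sketch of the packing idea, assuming non-negative values and assuming that unpacking happens on the client after decryption. The names `pack` and `unpack` are hypothetical.

```python
def pack(array, threshold):
    """Append a selection bit (value > threshold) below each value."""
    return [(value << 1) | int(value > threshold) for value in array]


def unpack(packed):
    """Keep only the values whose selection bit is set."""
    return [entry >> 1 for entry in packed if entry & 1]


packed = pack([1, 5, 3, 7], threshold=3)
filtered = unpack(packed)
```

Packing keeps the output size fixed regardless of how many elements pass the comparison, which is convenient because the number of selected elements is itself secret.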

Matrix Row/Col means

This example introduces a key concept when using Concrete: maximizing parallelization. Instead of sequentially summing all values to compute a mean, the values are split into sub-groups, and the mean of these sub-group means is computed:
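The sub-group approach can be checked in cleartext. The sketch below uses exact fractions to avoid rounding effects and assumes that all sub-groups have the same size, which is what makes the mean of the sub-group means equal to the overall mean. `grouped_mean` is a hypothetical helper.

```python
from fractions import Fraction


def mean(values):
    """Exact arithmetic mean."""
    return Fraction(sum(values), len(values))


def grouped_mean(values, group_size):
    """Mean of the means of equally sized sub-groups."""
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    # each sub-group mean is independent, so they could be computed in parallel
    return mean([mean(group) for group in groups])


row = [3, 1, 4, 1, 5, 9, 2, 6]
direct = mean(row)
grouped = grouped_mean(row, group_size=4)
```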

Table Lookups basics

In TFHE, there are mainly two kinds of operations: linear operations, such as additions, subtractions, and multiplications by an integer, and non-linear operations. Non-linear operations are achieved with Table Lookups (TLUs).

Performance

When using TLUs in Concrete, the most crucial factor for speed is the bit-width of the TLU. The smaller the bit width, the faster the corresponding FHE operation. Therefore, you should reduce the size of inputs to the lookup tables whenever possible. At the end of this document, we discuss methods for truncating or rounding entries to decrease the effective input size, further improving TLU performance.

Direct TLU

A direct TLU performs operations in the form of y = T[i], where T is a table and i is an index. You can define the table using fhe.LookupTable and apply it to scalars or tensors.

Scalar lookup

Tensor lookup

The LookupTable behaves like Python's array indexing, where negative indices access elements from the end of the table.
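The indexing behaviour described above can be mirrored with a plain Python list. This is a cleartext sketch with made-up table contents, not a use of fhe.LookupTable.

```python
table = [2, -1, 3, 0]  # hypothetical table contents


def lookup(i):
    """y = T[i], with Python indexing semantics."""
    return table[i]


# scalar lookup
scalar_result = lookup(2)

# tensor lookup: the same table is applied to every element
tensor = [[0, 1], [2, 3]]
tensor_result = [[lookup(i) for i in row] for row in tensor]

# negative indices count from the end of the table
negative_result = lookup(-1)
```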

Multi TLU

A multi TLU is used to apply different elements of the input to different tables (e.g., square the first column, cube the second column):
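In cleartext, the square-and-cube example can be sketched with one table per position. The nested lists below are an illustration only and do not show how Concrete represents a multi TLU.

```python
squared = [i ** 2 for i in range(4)]
cubed = [i ** 3 for i in range(4)]

# one table per element: first column squared, second column cubed
tables = [
    [squared, cubed],
    [squared, cubed],
    [squared, cubed],
]

x = [[0, 1], [2, 3], [3, 0]]
result = [
    [tables[r][c][x[r][c]] for c in range(len(x[r]))]
    for r in range(len(x))
]
```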

Transparent TLU

In many cases, you won't need to define your own TLUs, as Concrete will set them for you.

Note that this kind of TLU is compatible with the TLU options, particularly with rounding and truncating which are explained below.

Optimizing input size

Truncating

The first option is to set i' as the truncation of i. In this method, we just take the most significant bits of i. This is done with fhe.truncate_bit_pattern.
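A cleartext sketch of truncation, assuming a 6-bit input reduced to its 3 most significant bits. The function f and the helper `truncate` are hypothetical, and the sketch models only the arithmetic, not the behaviour of fhe.truncate_bit_pattern itself.

```python
INPUT_BITS = 6
LSBS_TO_REMOVE = 3


def f(i):
    # hypothetical function that the table implements
    return i * i


def truncate(i, lsbs_to_remove):
    """Keep only the most significant bits of i."""
    return i >> lsbs_to_remove


# the reduced table has 2**3 entries instead of 2**6
reduced_table = [f(j << LSBS_TO_REMOVE) for j in range(2 ** (INPUT_BITS - LSBS_TO_REMOVE))]

i = 0b101101                               # 45
i_prime = truncate(i, LSBS_TO_REMOVE)      # 0b101
approximation = reduced_table[i_prime]     # f(40) instead of f(45)
```

The smaller table is cheaper to evaluate, at the price of replacing f(45) by f(40) in this example.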

Rounding

The second option is to set i' as the rounded value of i. In this method, we take the most significant bits of i and round up by 1 if the most significant ignored bit is 1. This is done with fhe.round_bit_pattern.

However, this approach can be slightly more complex, as rounding might result in an index that exceeds the original table's bounds. To handle this, we expand the original table by one additional index:
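The overflow case can be seen in a cleartext sketch, again assuming a 6-bit input reduced to 3 bits and a hypothetical function f. `round_msbs` is an illustrative helper, not fhe.round_bit_pattern.

```python
INPUT_BITS = 6
LSBS_TO_REMOVE = 3


def f(i):
    # hypothetical function that the table implements
    return i * i


def round_msbs(i, lsbs_to_remove):
    """Most significant bits of i, plus 1 if the top ignored bit is set."""
    return (i + (1 << (lsbs_to_remove - 1))) >> lsbs_to_remove


# one extra entry so that the largest inputs can round up past the last index
table_size = 2 ** (INPUT_BITS - LSBS_TO_REMOVE) + 1
expanded_table = [f(j << LSBS_TO_REMOVE) for j in range(table_size)]

regular = round_msbs(0b101101, LSBS_TO_REMOVE)    # 45 rounds to index 6
overflow = round_msbs(0b111111, LSBS_TO_REMOVE)   # 63 rounds to index 8
```

Index 8 does not exist in a 3-bit table, which is why the expanded table holds 9 entries.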

Approximate rounding

For further optimizations, the fhe.round_bit_pattern function has an exactness=fhe.Exactness.APPROXIMATE option, which allows for faster computations at the cost of minor differences between cleartext and encrypted results:


With modules

This document explains how to compile Fully Homomorphic Encryption (FHE) modules containing multiple functions using Concrete.

Deploying a server that contains many compatible functions is important for some use cases. With Concrete, you can compile FHE modules containing as many functions as needed.

Single inputs / outputs

The following example demonstrates how to create an FHE module:

Then, to compile the Counter module, use the compile method with a dictionary of input-sets for each function:

After the module is compiled, you can encrypt and call the different functions as follows:

You can generate the keyset beforehand by calling the keygen() method on the compiled module:

Multi inputs / outputs

Composition is not limited to single input / single output. Here is an example that computes the first 10 elements of the Fibonacci sequence in FHE:

Executing this script will provide the following output:

Iterations

Modules support iteration with cleartext iterands to some extent, particularly for loops structured like this:

This script prints the following output:

In this example, a while loop iterates until the decrypted value equals 1. The loop body is implemented in FHE, but the iteration control must be in cleartext.
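
The control flow can be checked in cleartext first. The following is a minimal plain-Python sketch of the same loop, with no encryption; the function names `collatz_step` and `collatz_sequence` are hypothetical. It uses the branch-free form `is_x_odd * (z - y) + y` so that the loop body contains no `if`, while the `while` condition stays outside the step function.

```python
def collatz_step(x: int):
    """One Collatz step written without branching on x."""
    y = x // 2          # value to use when x is even
    z = 3 * x + 1       # value to use when x is odd
    is_x_odd = x & 1    # lowest bit of x
    ans = is_x_odd * (z - y) + y
    return ans, ans == 1


def collatz_sequence(x: int):
    """Iterate the step until the result is 1; the loop control is plain Python."""
    values = []
    is_one = False
    while not is_one:
        x, is_one = collatz_step(x)
        values.append(x)
    return values


sequence = collatz_sequence(19)
```

In the encrypted version, only the body of `collatz_step` runs in FHE, and the `is_one` flag is the single value that has to be decrypted at each iteration.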

Runtime optimization

By default, when using modules, all inputs and outputs of every function are compatible, sharing the same precision and crypto-parameters. This approach applies the crypto-parameters of the most costly code path to all code paths. This simplicity may be costly and unnecessary for some use cases.

To optimize runtime, we provide finer-grained control over the composition policy via the composition module attribute. Here is an example:

You have 3 options for the composition attribute:

  1. fhe.AllComposable (default): This policy ensures that all ciphertexts used in the module are compatible. It is the least restrictive policy but the most costly in terms of performance.

  2. fhe.NotComposable: This policy is the most restrictive but the least costly. It is suitable when you do not need any composition and only want to pack multiple functions in a single artifact.

  3. fhe.Wired: This policy allows you to define custom composition rules. You can specify which outputs of a function can be forwarded to which inputs of another function.

Note that, when the composition logic is complex, another option is to rely on automatic module tracing (see the Automatic module tracing section) to derive the composition from examples.

In this case, the policy states that the first output of the collatz function can be forwarded to the first input of collatz, but not the second output (which is decrypted every time, and used for control flow).

You can use the fhe.Wire between any two functions. It is also possible to define wires with fhe.AllInputs and fhe.AllOutputs ends. For instance, in the previous example:

This policy would be equivalent to using the fhe.AllComposable policy.

Automatic module tracing

When a module's composition logic is static and straightforward, declaratively defining a Wired policy is usually the simplest approach. However, in cases where modules have more complex or dynamic composition logic, deriving an accurate list of Wire components to be used in the policy can become challenging.

Another related problem is defining different function input-sets. When the composition logic is simple, these can be provided manually. But as the composition gets more convoluted, computing a consistent ensemble of inputsets for a module may become intractable.

For those advanced cases, you can derive the composition rules and the input-sets automatically from user-provided examples. Consider the following module:

You can use the wire_pipeline context manager to activate the module tracing functionality:

Note that any dynamic branching is possible during module tracing. However, for complex runtime logic, ensure that the input set provides sufficient examples to cover all potential code paths.

Current Limitations

Depending on the functions, composition may add a significant overhead compared to a non-composable version.

To be composable, a function must meet the following condition: every output that can be forwarded as input (according to the composition policy) must contain a noise-refreshing operation. Since adding a noise refresh has a noticeable impact on performance, Concrete does not automatically include it.

For instance, to implement a function that doubles an encrypted value, you might write:

This function is valid with the fhe.NotComposable policy. However, if compiled with the fhe.AllComposable policy, it will raise a RuntimeError: Program cannot be composed: ..., indicating that an extra Programmable Bootstrapping (PBS) step must be added.

To resolve this and make the circuit valid, add a PBS at the end of the circuit:

Combining compiled functions

This document explains how to combine compiled functions in Concrete, focusing on scenarios where multiple functions need to work together seamlessly. The goal is to ensure that outputs from certain functions can be used as inputs for others without decryption, including in recursive functions.

Concrete offers two methods to achieve this:

With composition

This document explains how to combine compiled functions with the composable flag in Concrete.

By setting the composable flag to True, you can compile a function such that its outputs can be reused as inputs. For example, you can then easily compute f(f(x)) or f**i(x) = f(f(...(f(x))...)) for a non-encrypted integer variable i, which is usually required for recursions.

Here is an example:

This document introduces the usage and optimization strategies of non-linear operations in Concrete, focusing on comparisons, min/max operations, bitwise operations, and shifts. For a more in-depth explanation of advanced options, refer to the Table Lookup advanced documentation.

Non-linear operations: These require Table Lookups (TLUs) to maintain the semantic integrity of the user's program. TLUs are slower, and their performance varies depending on the bit width of the inputs.

All binary operations described in this document can also be implemented with the fhe.multivariate function, which is described in the fhe.multivariate function documentation. Here's an example:

Multivariate functions cannot be called with rounded inputs.

Perform a convolution operation, with the same semantics as onnx.Conv:

Perform a maxpool operation, with the same semantics as onnx.MaxPool:

Another option to fine-tune the implementation is relu_on_bits_chunk_size: int = 2. For example, setting relu_on_bits_chunk_size=4 means that when using the second implementation (using chunks), the input is split into 4-bit chunks using fhe.bits, and then the ReLU is applied to those chunks, which are then combined back.

fhe.if_then_else is just an alias for np.where.

This document introduces several common techniques for optimizing code to fit Fully Homomorphic Encryption (FHE) . The examples provided demonstrate various workarounds and performance optimizations that you can implement while working with the Concrete library.

All code snippets provided here are temporary workarounds. In future versions of Concrete, some functions described here could be directly available in a more generic and efficient form. These code snippets come from support answers in our community forum.

This document introduces the concept of Table Lookups (TLUs) in Concrete, covers the basic TLU usage, performance considerations, and some basic techniques for optimizing TLUs in encrypted computations. For more advanced TLU usage, refer to the Table Lookup advanced section.

The fhe.univariate and fhe.multivariate extensions are convenient ways to perform more complex operations as transparent TLUs.

Reducing the bit size of TLU inputs is essential for execution efficiency, as mentioned in the previous section. One effective method is to replace the table lookup y = T[i] by some y = T'[i'], where i' only has the most significant bits of i and T' is a much shorter table. This approach can significantly speed up the TLU while maintaining acceptable accuracy in many applications, such as machine learning.

In this section, we introduce two basic techniques: truncating or rounding. You can find a more in-depth explanation and other advanced optimization techniques in the TLU advanced documentation.

These modules support the composition of different functions, meaning that the encrypted result of one function can be used as the input for another function without needing to decrypt it first. Additionally, a module is , making it as simple to use as a single-function project.

Unbounded loops or complex dynamic conditions are also supported, as long as these conditions are computed in pure cleartext in Python. The following example computes the Collatz sequence:

Using the composable flag: This method is suitable when there is a single function. The composable flag allows the function to be compiled in a way that its output can be used as input for subsequent operations. For more details, refer to the composition documentation.

Using Concrete modules: This method is ideal when dealing with multiple functions or when more control is needed over how outputs are reused as inputs. Concrete modules allow you to specify precisely how functions interact. For further information, see the modules documentation.

Note that this option is equivalent to using the fhe.AllComposable policy of modules. In particular, the same limitations may occur (see the limitations documentation section).

Table Lookup advanced documentation
Table Lookups (TLUs)
rounded
onnx.Conv
onnx.MaxPool
fhe.bits
np.where
fhe.multivariate function documentation
import numpy as np
from concrete import fhe

@fhe.compiler({"array": "encrypted", "index": "encrypted"})
def indexed_value(array, index):
    all_indices = np.arange(array.size)
    index_selection = index == all_indices
    selection_and_zeros = array * index_selection
    selection = np.sum(selection_and_zeros)
    return selection

inputset = [(np.random.randint(0, 16, size=5), np.random.randint(0, 5)) for _ in range(50)]
circuit = indexed_value.compile(inputset)

array = np.random.randint(0, 16, size=5)

index = np.random.randint(0, 5)
assert circuit.encrypt_run_decrypt(array, index) == array[index]
import numpy as np
from concrete import fhe

@fhe.compiler({"numbers": "encrypted", "threshold": "encrypted"})
def filtering(numbers, threshold):
    is_greater = numbers > threshold

    shifted_numbers = numbers * 2  # open space for a single bit at the end
    combined_numbers_and_is_greater = shifted_numbers + is_greater  # put is_greater to that bit

    def extract(combination):
        is_greater = (combination % 2) == 1  # extract is_greater back from packing
        if_true = combination // 2  # if is greater is true, we unpack the number and use it
        if_false = 0  # otherwise we set the element to zero
        return np.where(is_greater, if_true, if_false)  # and apply the operation

    return fhe.univariate(extract)(combined_numbers_and_is_greater)

inputset = [(np.random.randint(0, 16, size=5), np.random.randint(0, 16)) for _ in range(50)]
circuit = filtering.compile(inputset)

numbers = np.random.randint(0, 16, size=5)
threshold = np.random.randint(0, 16)
assert np.array_equal(circuit.encrypt_run_decrypt(numbers, threshold), list(map(lambda x: x if x > threshold else 0, numbers)))
import numpy as np
from concrete import fhe

def smallest_prime_divisor(n):
    if n % 2 == 0:
        return 2

    for i in range(3, int(np.sqrt(n)) + 1):
        if n % i == 0:
            return i

    return n

def mean_of_vector(x):
    assert x.size != 0
    if x.size == 1:
        return x[0]

    group_size = smallest_prime_divisor(x.size)
    if x.size == group_size:
        return np.round(np.sum(x) / x.size).astype(np.int64)

    groups = []
    for i in range(x.size // group_size):
        start = i * group_size
        end = start + group_size
        groups.append(x[start:end])

    mean_of_groups = []
    for group in groups:
        mean_of_groups.append(np.round(np.sum(group) / group_size).astype(np.int64))

    return mean_of_vector(fhe.array(mean_of_groups))

@fhe.compiler(({"x": "encrypted"}))
def mean_of_matrix(x):
    return mean_of_vector(x.flatten())

@fhe.compiler(({"x": "encrypted"}))
def mean_of_rows_of_matrix(x):
    means = []
    for i in range(x.shape[0]):
        means.append(mean_of_vector(x[i]))
    return fhe.array(means)

@fhe.compiler(({"x": "encrypted"}))
def mean_of_columns_of_matrix(x):
    means = []
    for i in range(x.shape[1]):
        means.append(mean_of_vector(x[:, i]))
    return fhe.array(means)


inputset = [np.random.randint(0, 16, size=(5,5)) for _ in range(50)]
matrix = np.random.randint(0, 16, size=(5, 5))

circuit = mean_of_matrix.compile(inputset)
assert circuit.encrypt_run_decrypt(matrix) == round(matrix.mean())

circuit = mean_of_rows_of_matrix.compile(inputset)
assert np.array_equal(circuit.encrypt_run_decrypt(matrix), [round(x) for x in matrix.mean(1)])

circuit = mean_of_columns_of_matrix.compile(inputset)
assert np.array_equal(circuit.encrypt_run_decrypt(matrix), [round(x) for x in matrix.mean(0)])
from concrete import fhe

table = fhe.LookupTable([2, -1, 3, 0])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

inputset = range(4)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == table[0] == 2
assert circuit.encrypt_run_decrypt(1) == table[1] == -1
assert circuit.encrypt_run_decrypt(2) == table[2] == 3
assert circuit.encrypt_run_decrypt(3) == table[3] == 0
from concrete import fhe
import numpy as np

table = fhe.LookupTable([2, -1, 3, 0])


@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]


inputset = [np.random.randint(0, 4, size=(2, 3)) for _ in range(10)]
circuit = f.compile(inputset)

sample = [
    [0, 1, 3],
    [2, 3, 1],
]
expected_output = [
    [2, -1, 0],
    [3, 0, -1],
]
actual_output = circuit.encrypt_run_decrypt(np.array(sample))

assert np.array_equal(actual_output, expected_output)
from concrete import fhe
import numpy as np

squared = fhe.LookupTable([i ** 2 for i in range(4)])
cubed = fhe.LookupTable([i ** 3 for i in range(4)])

table = fhe.LookupTable([
    [squared, cubed],
    [squared, cubed],
    [squared, cubed],
])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

inputset = [np.random.randint(0, 4, size=(3, 2)) for _ in range(10)]
circuit = f.compile(inputset)

sample = [
    [0, 1],
    [2, 3],
    [3, 0],
]
expected_output = [
    [0, 1],
    [4, 27],
    [9, 0]
]
actual_output = circuit.encrypt_run_decrypt(np.array(sample))

assert np.array_equal(actual_output, expected_output)
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x ** 2

inputset = range(4)
circuit = f.compile(inputset, show_mlir = True)

assert circuit.encrypt_run_decrypt(0) == 0
assert circuit.encrypt_run_decrypt(1) == 1
assert circuit.encrypt_run_decrypt(2) == 4
assert circuit.encrypt_run_decrypt(3) == 9
from concrete import fhe
import numpy as np

table = fhe.LookupTable([i**2 for i in range(16)])
lsbs_to_remove = 1


@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[fhe.truncate_bit_pattern(x, lsbs_to_remove)]


inputset = range(16)
circuit = f.compile(inputset)

for i in range(16):
    rounded_i = int(i / 2**lsbs_to_remove) * 2**lsbs_to_remove

    assert circuit.encrypt_run_decrypt(i) == rounded_i**2
from concrete import fhe
import numpy as np

table = fhe.LookupTable([i**2 for i in range(17)])
lsbs_to_remove = 1


def our_round(x):
    float_part = x - np.floor(x)
    if float_part < 0.5:
        return int(np.floor(x))
    return int(np.ceil(x))


@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[fhe.round_bit_pattern(x, lsbs_to_remove)]


inputset = range(16)
circuit = f.compile(inputset)

for i in range(16):
    rounded_i = our_round(i * 1.0 / 2**lsbs_to_remove) * 2**lsbs_to_remove

    assert (
        circuit.encrypt_run_decrypt(i) == rounded_i**2
    ), f"Miscomputation {i=} {circuit.encrypt_run_decrypt(i)} {rounded_i**2}"
from concrete import fhe
import numpy as np

table = fhe.LookupTable([i**2 for i in range(17)])
lsbs_to_remove = 1


@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[fhe.round_bit_pattern(x, lsbs_to_remove, exactness=fhe.Exactness.APPROXIMATE)]


inputset = range(16)
circuit = f.compile(inputset)

for i in range(16):
    lower_i = np.floor(i * 1.0 / 2**lsbs_to_remove) * 2**lsbs_to_remove
    upper_i = np.ceil(i * 1.0 / 2**lsbs_to_remove) * 2**lsbs_to_remove

    assert circuit.encrypt_run_decrypt(i) in [
        lower_i**2,
        upper_i**2,
    ], f"Miscomputation {i=} {circuit.encrypt_run_decrypt(i)} {[lower_i**2, upper_i**2]}"
from concrete import fhe

@fhe.module()
class Counter:
    @fhe.function({"x": "encrypted"})
    def inc(x):
        # Parenthesize before the modulo: `x + 1 % 20` would be parsed as `x + (1 % 20)`
        return (x + 1) % 20

    @fhe.function({"x": "encrypted"})
    def dec(x):
        return (x - 1) % 20
inputset = list(range(20))
CounterFhe = Counter.compile({"inc": inputset, "dec": inputset})
x = 5
x_enc = CounterFhe.inc.encrypt(x)
x_inc_enc = CounterFhe.inc.run(x_enc)
x_inc = CounterFhe.inc.decrypt(x_inc_enc)
assert x_inc == 6

x_inc_dec_enc = CounterFhe.dec.run(x_inc_enc)
x_inc_dec = CounterFhe.dec.decrypt(x_inc_dec_enc)
assert x_inc_dec == 5

for _ in range(10):
    x_enc = CounterFhe.inc.run(x_enc)
x_dec = CounterFhe.inc.decrypt(x_enc)
assert x_dec == 15
CounterFhe.keygen()
from concrete import fhe

def noise_reset(x):
   return fhe.univariate(lambda x: x)(x)

@fhe.module()
class Fibonacci:

    @fhe.function({"n1th": "encrypted", "nth": "encrypted"})
    def fib(n1th, nth):
       return noise_reset(nth), noise_reset(n1th + nth)

print("Compiling `Fibonacci` module ...")
inputset = list(zip(range(0, 100), range(0, 100)))
FibonacciFhe = Fibonacci.compile({"fib": inputset})

print("Generating keyset ...")
FibonacciFhe.keygen()

print("Encrypting initial values")
n1th = 1
nth = 2
(n1th_enc, nth_enc) = FibonacciFhe.fib.encrypt(n1th, nth)

print(f"|           ||        (n-1)-th       |         n-th          |")
print(f"| iteration || decrypted | cleartext | decrypted | cleartext |")
for i in range(10):
    (n1th_enc, nth_enc) = FibonacciFhe.fib.run(n1th_enc, nth_enc)
    (n1th, nth) = Fibonacci.fib(n1th, nth)

    # For demo purpose; no decryption is needed.
    (n1th_dec, nth_dec) = FibonacciFhe.fib.decrypt(n1th_enc, nth_enc)
    print(f"|     {i}     || {n1th_dec:<9} | {n1th:<9} | {nth_dec:<9} | {nth:<9} |")
Compiling `Fibonacci` module ...
Generating keyset ...
Encrypting initial values
|           ||        (n-1)-th       |         n-th          |
| iteration || decrypted | cleartext | decrypted | cleartext |
|     0     || 2         | 2         | 3         | 3         |
|     1     || 3         | 3         | 5         | 5         |
|     2     || 5         | 5         | 8         | 8         |
|     3     || 8         | 8         | 13        | 13        |
|     4     || 13        | 13        | 21        | 21        |
|     5     || 21        | 21        | 34        | 34        |
|     6     || 34        | 34        | 55        | 55        |
|     7     || 55        | 55        | 89        | 89        |
|     8     || 89        | 89        | 144       | 144       |
|     9     || 144       | 144       | 233       | 233       |
for i in some_cleartext_constant_range:
    # Do something in FHE in the loop body, implemented as an FHE function.
from concrete import fhe

@fhe.module()
class Collatz:

    @fhe.function({"x": "encrypted"})
    def collatz(x):

        y = x // 2
        z = 3 * x + 1

        is_x_odd = fhe.bits(x)[0]

        # In a fast way, compute ans = is_x_odd * (z - y) + y
        ans = fhe.multivariate(lambda b, x: b * x)(is_x_odd, z - y) + y

        is_one = ans == 1

        return ans, is_one


print("Compiling `Collatz` module ...")
inputset = [i for i in range(63)]
CollatzFhe = Collatz.compile({"collatz": inputset})

print("Generating keyset ...")
CollatzFhe.keygen()

print("Encrypting initial value")
x = 19
x_enc = CollatzFhe.collatz.encrypt(x)
is_one_enc = None

print(f"| decrypted | cleartext |")
while is_one_enc is None or not CollatzFhe.collatz.decrypt(is_one_enc):
    x_enc, is_one_enc = CollatzFhe.collatz.run(x_enc)
    x, is_one = Collatz.collatz(x)

    # For demo purpose; no decryption is needed.
    x_dec = CollatzFhe.collatz.decrypt(x_enc)
    print(f"| {x_dec:<9} | {x:<9} |")
Compiling `Collatz` module ...
Generating keyset ...
Encrypting initial value
| decrypted | cleartext |
| 58        | 58        |
| 29        | 29        |
| 88        | 88        |
| 44        | 44        |
| 22        | 22        |
| 11        | 11        |
| 34        | 34        |
| 17        | 17        |
| 52        | 52        |
| 26        | 26        |
| 13        | 13        |
| 40        | 40        |
| 20        | 20        |
| 10        | 10        |
| 5         | 5         |
| 16        | 16        |
| 8         | 8         |
| 4         | 4         |
| 2         | 2         |
| 1         | 1         |
from concrete import fhe

@fhe.module()
class Collatz:

    @fhe.function({"x": "encrypted"})
    def collatz(x):
        y = x // 2
        z = 3 * x + 1
        is_x_odd = fhe.bits(x)[0]
        ans = fhe.multivariate(lambda b, x: b * x)(is_x_odd, z - y) + y
        is_one = ans == 1
        return ans, is_one

    composition = fhe.AllComposable()
Here is an example:
from concrete import fhe
from concrete.fhe import Wired, Wire, Output, Input

@fhe.module()
class Collatz:

    @fhe.function({"x": "encrypted"})
    def collatz(x):
        y = x // 2
        z = 3 * x + 1
        is_x_odd = fhe.bits(x)[0]
        ans = fhe.multivariate(lambda b, x: b * x)(is_x_odd, z - y) + y
        is_one = ans == 1
        return ans, is_one

    composition = Wired(
        {
            Wire(Output(collatz, 0), Input(collatz, 0))
        }
    )
    composition = Wired(
        {
            Wire(fhe.AllOutputs(collatz), fhe.AllInputs(collatz))
        }
    )
import numpy as np
from concrete import fhe

@fhe.module()
class MyModule:
    @fhe.function({"x": "encrypted"})
    def increment(x):
        return (x + 1) % 100

    @fhe.function({"x": "encrypted"})
    def decrement(x):
        return (x - 1) % 100

    @fhe.function({"x": "encrypted"})
    def decimate(x):
        return (x // 10) % 100

    composition = fhe.Wired()
# A single inputset used during tracing is defined
inputset = [np.random.randint(1, 100, size=()) for _ in range(100)]

# The inputset is passed to the `wire_pipeline` method, which itself returns an iterator over the inputset samples.
with MyModule.wire_pipeline(inputset) as samples_iter:

    # The inputset is iterated over
    for s in samples_iter:

        # Here we provide an example of how we expect the module functions to be used at runtime in fhe.
        MyModule.increment(MyModule.decimate(MyModule.decrement(s)))

# It is not needed to provide any inputsets to the `compile` method after tracing the wires, since those were already computed automatically during the module tracing.
module = MyModule.compile(
    p_error=0.01,
)
@fhe.module()
class Doubler:
    @fhe.function({"counter": "encrypted"})
    def double(counter):
        return counter * 2
@fhe.module()
class Doubler:
    @fhe.function({"counter": "encrypted"})
    def double(counter):
        # fhe.refresh adds the noise-refreshing PBS required for composition
        return fhe.refresh(counter * 2)
from concrete import fhe

@fhe.compiler({"counter": "encrypted"})
def increment(counter):
   return (counter + 1) % 100

print("Compiling `increment` function")
increment_fhe = increment.compile(list(range(0, 100)), composable=True)

print("Generating keyset ...")
increment_fhe.keygen()

print("Encrypting the initial counter value")
counter = 0
counter_enc = increment_fhe.encrypt(counter)

print(f"| iteration || decrypted | cleartext |")
for i in range(10):
    counter_enc = increment_fhe.run(counter_enc)
    counter = increment(counter)

    # For demo purpose; no decryption is needed.
    counter_dec = increment_fhe.decrypt(counter_enc)
    print(f"|     {i}     || {counter_dec:<9} | {counter:<9} |")
community forum
Table Lookup advanced
Click here
Collatz sequence
composition documentation
modules documentation
fhe.univariate
fhe.multivariate
performance
TLU advanced documentation
truncating
rounding
modules
limitations documentation

Quick start

This document covers how to compute on encrypted data homomorphically using the Concrete framework. We will walk you through a complete example step-by-step.

The basic workflow of computation is as follows:

  1. Define the function you want to compute

  2. Compile the function into a Concrete Circuit

  3. Use the Circuit to perform homomorphic evaluation

Here is the complete example, which we will explain step by step in the following paragraphs.

from concrete import fhe

def add(x, y):
    return x + y

compiler = fhe.Compiler(add, {"x": "encrypted", "y": "encrypted"})

inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]

print(f"Compilation...")
circuit = compiler.compile(inputset)

print(f"Key generation...")
circuit.keygen()

print(f"Homomorphic evaluation...")
encrypted_x, encrypted_y = circuit.encrypt(2, 6)
encrypted_result = circuit.run(encrypted_x, encrypted_y)
result = circuit.decrypt(encrypted_result)

assert result == add(2, 6)

Decorator

Another simple way to compile a function is to use a decorator.

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x + 42

inputset = range(10)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(10) == f(10)

This decorator is a way to add the compile method to the function object without changing its name elsewhere.

Importing the library

Import the fhe module, which includes everything you need to perform homomorphic evaluation:

from concrete import fhe

Defining the function to compile

Here we define a simple addition function:

def add(x, y):
    return x + y

Creating a compiler

To compile the function, you first need to create a Compiler by specifying the function to compile and the encryption status of its inputs:

compiler = fhe.Compiler(add, {"x": "encrypted", "y": "encrypted"})

For instance, to set the input y as clear:

compiler = fhe.Compiler(add, {"x": "encrypted", "y": "clear"})

Defining an inputset

An inputset is a collection representing the typical inputs of the function. It is used to determine the bit widths and shapes of the variables within the function.

The inputset should be an iterable that yields tuples of the same length as the number of arguments of the compiled function.

For example:

inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]

Here, our inputset consists of 10 integer pairs, ranging from a minimum of (0, 0) to a maximum of (7, 7).
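
To make the link between the inputset and bit widths concrete, here is a plain-Python sketch that evaluates the function on every sample and records the observed minimum and maximum. This only mimics the idea of measuring bounds; it is not Concrete's implementation, and `observed_bounds` is a hypothetical helper.

```python
inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]


def add(x, y):
    return x + y


def observed_bounds(values):
    """Return the (minimum, maximum) of an iterable of integers."""
    values = list(values)
    return min(values), max(values)


x_bounds = observed_bounds(x for x, _ in inputset)
y_bounds = observed_bounds(y for _, y in inputset)
out_bounds = observed_bounds(add(x, y) for x, y in inputset)

# Number of bits needed to represent the largest observed unsigned output.
out_bits = max(out_bounds[1].bit_length(), 1)
```

With this inputset, both inputs stay within 3 bits, while the sum reaches 14 and therefore needs 4 bits. An input outside the range covered by the inputset may not fit the bit widths chosen at compilation.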

Compiling the function

Use the compile method of the Compiler class with an inputset to perform the compilation and get the resulting circuit:

print(f"Compilation...")
circuit = compiler.compile(inputset)

Generating the keys

Use the keygen method of the Circuit class to generate the keys (public and private):

print(f"Key generation...")
circuit.keygen()

If you don't call the key generation explicitly, keys will be generated lazily when needed.

Performing homomorphic evaluation

Now you can easily perform the homomorphic evaluation using the encrypt, run and decrypt methods of the Circuit:

print(f"Homomorphic evaluation...")
encrypted_x, encrypted_y = circuit.encrypt(2, 6)
encrypted_result = circuit.run(encrypted_x, encrypted_y)
result = circuit.decrypt(encrypted_result)

Common errors

This document explains the most common errors and provides solutions to fix them.

1. Could not find a version that satisfies the requirement concrete-python (from versions: none)

Error message: Could not find a version that satisfies the requirement concrete-python (from versions: none)

Cause: pip could not find a concrete-python package that is compatible with your environment.

Possible solutions:

  • Make sure that you use a supported Python version (currently from 3.8 to 3.11, inclusive).

  • Run pip install -U pip wheel setuptools before installing.

  • Consider adding --extra-index-url https://pypi.zama.ai/cpu or --extra-index-url https://pypi.zama.ai/gpu, depending on whether you want the CPU or the GPU wheel.

  • Concrete requires glibc>=2.28. Make sure that your system has a sufficiently recent version.

2. Only integers are supported

Error message: RuntimeError: Function you are trying to compile cannot be compiled with extra information only integers are supported

Cause: Parts of your program contain subgraphs that do not map integers to integers.

Possible solutions:

3. No parameters found

Error message: NoParametersFound

Cause: The optimizer can't find cryptographic parameters for the circuit that are both secure and correct.

Possible solutions:

  • Try to simplify your circuit.

  • Use smaller weights.

  • Add intermediate PBS to reduce the noise, with identity function fhe.refresh(x).

4. Too long inputs for table lookup

Error message: RuntimeError: Function you are trying to compile cannot be compiled, with extra information as this [...]-bit value is used as an input to a table lookup with but only up to 16-bit table lookups are supported

Cause: The program uses a Table Lookup that contains oversized inputs exceeding the current 16-bit limit.

Possible solutions:

  • Try to simplify your circuit.

  • Use smaller weights.

  • Look at the graph to understand where this oversized input comes from and ensure that the input size for Table Lookup operations does not exceed 16 bits.

  • Use show_bit_width_constraints=True to understand why bit widths are assigned the way they are.

5. Impossible to fuse multiple-nodes

Error message: RuntimeError: A subgraph within the function you are trying to compile cannot be fused because it has multiple input nodes

Cause: A subgraph in your program uses two or more input nodes. Such a subgraph cannot be fused, that is, replaced by a single table lookup. Concrete marks the input nodes with the message "this is one of the input nodes" printed in the circuit.

Possible solutions:

  • Try to simplify your circuit.

  • Have a look at fhe.multivariate.

6. Function is not supported

Error message: RuntimeError: Function '[...]' is not supported

Cause: The function used is not currently supported by Concrete.

Possible solutions:

  • Try to change your program.

  • Check the corresponding documentation to see if there are ways to implement the function differently.

7. Branching is not allowed

Error message: RuntimeError: Branching within circuits is not possible

Cause: Branching operations, such as if statements or non-constant loops, are not supported in Concrete's FHE programs.

Possible solutions:

  • Change your program.

  • Consider using tricks to replace ternary-if, as c ? t : f = f + c * (t-f).
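
The ternary-if replacement above can be sketched in plain Python as follows. This is a cleartext illustration of the identity only; the helper name `select` is hypothetical, and `c` is assumed to be 0 or 1.

```python
def select(c: int, t: int, f: int) -> int:
    """Branch-free equivalent of `t if c else f`, assuming c is 0 or 1."""
    return f + c * (t - f)


when_true = select(1, 10, 3)   # c = 1 picks t
when_false = select(0, 10, 3)  # c = 0 picks f
```

When c is 1 the expression reduces to f + (t - f) = t, and when c is 0 it reduces to f, so no `if` statement is needed inside the function.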

8. Unfeasible noise constraint

Error message: Unfeasible noise constraint encountered

Cause: The optimizer can't find cryptographic parameters for the circuit that are both secure and correct.

Possible solutions:

  • Try to simplify your circuit.

  • Use smaller weights.

  • Add intermediate PBS to reduce the noise, with identity function fhe.refresh(x).

9. Non composable circuit

Error message: Program can not be composed

Cause: Some circuit outputs are contaminated by unrefreshed input noise.

Possible solutions:

  • Add intermediate PBS to refresh the noise with fhe.refresh(x).

Compression

This document explains the compression feature in Concrete and its performance impact.

Fully Homomorphic Encryption (FHE) needs both ciphertexts (encrypted data) and evaluation keys to carry out the homomorphic evaluation of a function. Both elements are large, which may critically affect the application's performance depending on the use case, application deployment, and the method for transmitting and storing ciphertexts and evaluation keys.

Enabling compression

During compilation, you can enable compression options to enforce the use of compression features. The two available compression options are:

  • compress_evaluation_keys: bool = False,

    • This specifies that serialization takes the compressed form of evaluation keys.

  • compress_input_ciphertexts: bool = False,

    • This specifies that serialization takes the compressed form of input ciphertexts.

You can see the impact of compression by comparing the size of the serialized form of input ciphertexts and evaluation keys with a sample code:

Compression algorithms

The compression factor largely depends on the cryptographic parameters identified and the compression algorithms selected during the compilation.

Currently, Concrete uses seeded compression algorithms. These algorithms rely on the fact that CSPRNGs are deterministic. Consequently, the chain of random values can be replaced by the seed and later recalculated using the same seed.

Typically, the size of a ciphertext is (lwe dimension + 1) * 8 bytes, while the size of a seeded ciphertext is constant, equal to 3 * 8 bytes. Thus, the compression factor ranges from a hundred to thousands. Understanding the compression factor of evaluation keys is complex. The compression factor of evaluation keys typically ranges between 0 and 10.
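
The ciphertext sizes quoted above translate into a compression factor as sketched below. The LWE dimensions used here are illustrative values chosen for this example, not parameters reported by Concrete, and the function names are hypothetical.

```python
SEEDED_CIPHERTEXT_BYTES = 3 * 8  # constant size of a seeded ciphertext


def lwe_ciphertext_bytes(lwe_dimension: int) -> int:
    """Size of an uncompressed ciphertext: (lwe dimension + 1) * 8 bytes."""
    return (lwe_dimension + 1) * 8


def compression_factor(lwe_dimension: int) -> float:
    """Ratio between the uncompressed and the seeded ciphertext sizes."""
    return lwe_ciphertext_bytes(lwe_dimension) / SEEDED_CIPHERTEXT_BYTES


# Illustrative dimensions only (assumption for this sketch).
factors = {n: compression_factor(n) for n in (599, 1023, 2047)}
```

Because the seeded size is constant, the factor grows linearly with the LWE dimension selected during compilation.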

Please note that while compression may save bandwidth and disk space, it incurs the cost of decompression. Currently, decompression occurs more or less lazily during FHE evaluation, and you cannot control when it happens.

Performance

This document shows some basic things you can do to improve the performance of your circuit.

Here are some quick tips to reduce the execution time of your circuit:

  • Use tensors as much as possible in your circuits.

  • Tweak the p_error configuration option until you get the optimal exactness vs. performance tradeoff for your application.

Simulation

This document explains how to use simulation to speed up development, enabling faster prototyping while accounting for the inherent probability of errors in Fully Homomorphic Encryption (FHE) execution.

Using simulation for faster prototyping

To overcome this issue, simulation is introduced:

After the simulation runs, it prints the following results:

Overflow detection in simulation

Overflow can happen during an FHE computation, leading to unexpected behaviors. Using simulation can help you detect these events by printing a warning whenever an overflow happens. This feature is disabled by default, but you can enable it by setting detect_overflow_in_simulation=True during compilation.

To demonstrate, we will compile the previous circuit with overflow detection enabled and trigger an overflow:

You will see the following warning after the simulation call:

If you look at the MLIR (circuit.mlir), you will see that the input type is supposed to be eint4 represented in 4 bits with a maximum value of 15. Since there's an addition of the input, we used the maximum value (15) here to trigger an overflow (15 + 1 = 16 which needs 5 bits). The warning specifies the operation that caused the overflow and its location. Similar warnings will be displayed for all basic FHE operations such as add, mul, and lookup tables.
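The bit-width reasoning behind this warning can be reproduced in plain Python. The sketch below does not use Concrete at all; required_bits and overflows are hypothetical helpers that only illustrate why 15 fits in an unsigned 4-bit value while 15 + 1 does not.

```python
def required_bits(value: int) -> int:
    """Minimum number of bits needed to represent a non-negative integer."""
    return max(1, value.bit_length())


def overflows(value: int, bit_width: int) -> bool:
    """True if the value does not fit in an unsigned integer of bit_width bits."""
    return required_bits(value) > bit_width


largest_input = 15          # maximum value of a 4-bit unsigned integer
result = largest_input + 1  # the addition performed by the circuit

print(required_bits(largest_input), overflows(largest_input, 4))
print(required_bits(result), overflows(result, 4))
```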

GPU acceleration

This document explains how to use GPU acceleration with Concrete.

Concrete supports acceleration using one or more GPUs.

pip install concrete-python --extra-index-url https://pypi.zama.ai/gpu

Our GPU wheels are built with CUDA 11.8 and should be compatible with higher versions of CUDA.

GPU execution configuration

By default, the compiler and runtime use all available system resources, including all CPU cores and GPUs. You can adjust this by using the following environment variables:

SDFG_NUM_THREADS

  • Type: Integer

  • Default value: The number of hardware threads on the system (including hyperthreading) minus the number of GPUs in use.

  • Description: This variable determines the number of CPU threads that execute in parallel with the GPU for offloadable workloads. GPU scheduler threads (including CUDA threads and those used within Concrete) are necessary but can block or interfere with worker thread execution. Therefore, it is recommended to undersubscribe the CPU hardware threads by the number of GPU devices used.

SDFG_NUM_GPUS

  • Type: Integer

  • Default value: The number of GPUs available.

  • Description: This value determines the number of GPUs to use for offloading. This can be set to any value between 1 and the total number of GPUs on the system.

SDFG_MAX_BATCH_SIZE

  • Type: Integer

  • Default value: LLONG_MAX (no batch size limit)

  • Description: This value limits the maximum batch size for offloading in cases where the GPU memory is insufficient.

SDFG_DEVICE_TO_CORE_RATIO

  • Type: Integer

  • Default value: The ratio between the compute capability of the GPU (at index 0) and a CPU core.

  • Description: This ratio is used to balance the load between the CPU and GPU. If the GPU is underutilized, set this value higher to increase the amount of work offloaded to the GPU.

OMP_NUM_THREADS

  • Type: Integer

  • Default value: The number of hardware threads on the system, including hyperthreading.

  • Description: This value sets the number of OpenMP threads used on the CPU to parallelize the portions of program execution that are not yet supported for GPU offload.
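As an illustration, the variables above can be set from Python through os.environ. This is a minimal sketch under stated assumptions: the choice of a single GPU is hypothetical, and since this page does not say when the variables are read, setting them before importing concrete is simply the conservative option. The thread count follows the undersubscription advice given for SDFG_NUM_THREADS.

```python
import os

num_gpus = 1  # hypothetical choice: offload to a single GPU
hardware_threads = os.cpu_count() or 1

# Restrict offloading to the chosen number of GPUs.
os.environ["SDFG_NUM_GPUS"] = str(num_gpus)

# Undersubscribe the CPU by the number of GPUs in use, keeping at least one thread.
os.environ["SDFG_NUM_THREADS"] = str(max(1, hardware_threads - num_gpus))

# Import Concrete only after the environment is prepared, for example:
# from concrete import fhe

print(os.environ["SDFG_NUM_GPUS"], os.environ["SDFG_NUM_THREADS"])
```

The same variables can of course be exported in the shell that launches the program instead.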

Choosing a representative inputset is critical to allow the compiler to find accurate bounds of all the intermediate values. Evaluating the circuit with input values under or over the bounds may result in undefined behavior.

You can use the fhe.inputset(...) function to easily create random inputsets.

You can use floats as intermediate values. However, both inputs and outputs must be integers. Consider converting values to integers, for example with .astype(np.uint64).

Post your issue in our community channels.

Reduce the amount of table lookups in the circuit.

Try different implementation strategies for complex operations.

Utilize rounding and truncating if your application doesn't require precise execution.

Enable dataflow parallelization by setting dataflow_parallelize=True in the configuration.

Specify composition when using modules.

You can refer to our full Optimization Guide for detailed examples of how to do each of these, and more!

During development, the speed of homomorphic execution can be a blocker for fast prototyping. Although you can directly call the function you want to compile, this approach does not fully replicate FHE execution, which involves a certain probability of error (see Exactness).

This version is not available on pypi.org, which only hosts wheels with CPU support.

To use GPU acceleration, install the GPU/CUDA wheel from our Zama public PyPI repository using the following command:

After installing the GPU/CUDA wheel, you must configure the FHE program compilation to enable GPU offloading using the use_gpu option.

from concrete import fhe

def test_compression(compression):
    @fhe.compiler({"counter": "encrypted"})
    def f(counter):
        return counter // 2

    circuit = f.compile(fhe.inputset(fhe.tensor[fhe.uint2, 3]),
                        compress_evaluation_keys=compression,
                        compress_input_ciphertexts=compression)

    print(f"Sizes with compression = {compression}")
    print(f" - of the input ciphertext {len(circuit.encrypt(list([0 for i in range(3)])).serialize())}")
    print(f" - of the evaluation keys {len(circuit.keys.serialize())}")

test_compression(False)
test_compression(True)

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (x + 1) ** 2

inputset = [np.random.randint(0, 10, size=(10,)) for _ in range(10)]
circuit = f.compile(inputset, p_error=0.1, fhe_simulation=True)

sample = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

actual = f(sample)
simulation = circuit.simulate(sample)

print(actual.tolist())
print(simulation.tolist())

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[1, 4, 9, 16, 16, 36, 49, 64, 81, 100]

# compile with overflow detection enabled
circuit = f.compile(inputset, p_error=0.1, fhe_simulation=True, detect_overflow_in_simulation=True)
# cause an overflow
circuit.simulate([0,1,2,3,4,5,6,7,8,15])

WARNING at loc("script.py":3:0): overflow happened during addition in simulation

Manage keys

This document explains how to manage keys when using Concrete, introducing the key management API for generating, reusing, and securely handling keys.

Concrete generates keys lazily when needed. While this is convenient for development, it's not ideal for the production environment. The explicit key management API is available for you to easily generate and reuse keys as needed.

Definition

Let's start by defining a circuit with the following example:

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x ** 2

inputset = range(10)
circuit = f.compile(inputset)

Circuits have a keys property of type fhe.Keys, which includes several utilities for key management.

Generation

To explicitly generate keys for a circuit, use:

circuit.keys.generate()

Generated keys are stored in memory and remain unencrypted.

You can also set a custom seed for reproducibility:

circuit.keys.generate(seed=420)

Do not specify the seed manually in a production environment! This is not secure and should only be done for debugging purposes.

Serialization

To serialize keys, for tasks such as sending them across a network, use:

serialized_keys: bytes = circuit.keys.serialize()

Keys are not serialized in encrypted form. Please make sure you keep them in a safe environment, or encrypt them manually after serialization.

Deserialization

To deserialize the keys back after receiving serialized keys, use:

keys: fhe.Keys = fhe.Keys.deserialize(serialized_keys)

Assignment

Once you have a valid fhe.Keys object, you can directly assign it to the circuit:

circuit.keys = keys

If assigned keys are generated for a different circuit, an exception will be raised.

Saving

You can also use the filesystem to store the keys directly, without managing serialization and file management manually:

circuit.keys.save("/path/to/keys")

Keys are not saved in encrypted form. Please make sure you store them in a safe environment, or encrypt them manually after saving.

Loading

After saving keys to disk, you can load them back using:

circuit.keys.load("/path/to/keys")

Automatic Management

If you want to generate keys in the first run and reuse the keys in consecutive runs, use:

circuit.keys.load_if_exists_generate_and_save_otherwise("/path/to/keys")
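The load-or-generate pattern behind this call can be pictured with a small stand-alone sketch. The code below is not Concrete's implementation: load_or_generate is a hypothetical helper, and random bytes stand in for real keys. It only illustrates why the first run pays the generation cost while later runs reuse the saved file.

```python
import os
import tempfile
from pathlib import Path


def load_or_generate(path: Path, generate) -> bytes:
    """Return the bytes stored at path, generating and saving them on first use."""
    if path.exists():
        return path.read_bytes()
    data = generate()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data


with tempfile.TemporaryDirectory() as directory:
    location = Path(directory) / "keys"
    first_run = load_or_generate(location, lambda: os.urandom(32))   # generates and saves
    second_run = load_or_generate(location, lambda: os.urandom(32))  # loads from disk
    reused = first_run == second_run

print(reused)
```

As with the save method above, anything written this way is stored unencrypted, so the location should be protected accordingly.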

Serialization

This document explains how to serialize and deserialize ciphertexts and secret keys when working with TFHE-rs in Rust.

Concrete already has its own serialization functions (e.g. tfhers_bridge.export_value, tfhers_bridge.import_value, tfhers_bridge.keygen_with_initial_keys, tfhers_bridge.serialize_input_secret_key). However, when implementing a TFHE-rs computation in Rust, we must use a compatible serialization.

Ciphertexts

We can deserialize FheUint8 (and similarly other types) using bincode:

use std::fs;
use std::io::Cursor;
use std::path::Path;

use tfhe::FheUint8;

/// ...

fn load_fheuint8(path: &String) -> FheUint8 {
    let path_fheuint: &Path = Path::new(path);
    let serialized_fheuint = fs::read(path_fheuint).unwrap();
    let mut serialized_data = Cursor::new(serialized_fheuint);
    bincode::deserialize_from(&mut serialized_data).unwrap()
}

To serialize:

fn save_fheuint8(fheuint: FheUint8, path: &String) {
    let mut serialized_ct = Vec::new();
    bincode::serialize_into(&mut serialized_ct, &fheuint).unwrap();
    let path_ct: &Path = Path::new(path);
    fs::write(path_ct, serialized_ct).unwrap();
}

Secret Key

We can deserialize LweSecretKey using bincode:

use tfhe::core_crypto::prelude::LweSecretKey;

/// ...

fn load_lwe_sk(path: &String) -> LweSecretKey<Vec<u64>> {
    let path_sk: &Path = Path::new(path);
    let serialized_lwe_key = fs::read(path_sk).unwrap();
    let mut serialized_data = Cursor::new(serialized_lwe_key);
    bincode::deserialize_from(&mut serialized_data).unwrap()
}

To serialize:

fn save_lwe_sk(lwe_sk: LweSecretKey<Vec<u64>>, path: &String) {
    let mut serialized_lwe_key = Vec::new();
    bincode::serialize_into(&mut serialized_lwe_key, &lwe_sk).unwrap();
    let path_sk: &Path = Path::new(path);
    fs::write(path_sk, serialized_lwe_key).unwrap();
}

Debugging and artifact

This document provides guidance on debugging the compilation process.

Compiler debug and verbose modes

  • compiler_verbose_mode: Prints the compiler passes and shows the transformations applied. It can help identify the crash location if a crash occurs.

  • compiler_debug_mode: A more detailed version of the verbose mode, providing additional information, particularly useful for diagnosing crashes.

These flags might not work as expected in Jupyter notebooks as they output to stderr directly from C++.

Debug artifacts

Concrete includes an artifact system that simplifies the debugging process by automatically or manually exporting detailed information during compilation failures.

Automatic export

When a compilation fails, artifacts are automatically exported to the .artifacts directory in the working directory. Here's an example of what gets exported when a function fails to compile:

def f(x):
    return np.sin(x)

This function fails to compile because Concrete does not support floating-point outputs. When you try to compile it, an exception will be raised and the artifacts will be exported automatically. The following files will be generated in the .artifacts directory:

  • environment.txt: Information about your system setup, including the operating system and Python version.

Linux-5.12.13-arch1-2-x86_64-with-glibc2.29 #1 SMP PREEMPT Fri, 25 Jun 2021 22:56:51 +0000
Python 3.8.10

  • requirements.txt: The installed Python packages and their versions.

astroid==2.15.0
attrs==22.2.0
auditwheel==5.3.0
...
wheel==0.40.0
wrapt==1.15.0
zipp==3.15.0

  • function.txt: The code of the function that failed to compile.

def f(x):
    return np.sin(x)

  • parameters.txt: Information about the encryption status of the function's parameters.

x :: encrypted

  • 1.initial.graph.txt: The textual representation of the initial computation graph right after tracing.

%0 = x              # EncryptedScalar<uint3>
%1 = sin(%0)        # EncryptedScalar<float64>
return %1

  • final.graph.txt: The textual representation of the final computation graph right before MLIR conversion.

%0 = x              # EncryptedScalar<uint3>
%1 = sin(%0)        # EncryptedScalar<float64>
return %1

  • traceback.txt: Details of the error that occurred.

Traceback (most recent call last):
  File "/path/to/your/script.py", line 9, in <module>
    circuit = f.compile(inputset)
  File "/usr/local/lib/python3.10/site-packages/concrete/fhe/compilation/decorators.py", line 159, in compile
    return self.compiler.compile(inputset, configuration, artifacts, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/concrete/fhe/compilation/compiler.py", line 437, in compile
    mlir = GraphConverter.convert(self.graph)
  File "/usr/local/lib/python3.10/site-packages/concrete/fhe/mlir/graph_converter.py", line 677, in convert
    GraphConverter._check_graph_convertibility(graph)
  File "/usr/local/lib/python3.10/site-packages/concrete/fhe/mlir/graph_converter.py", line 240, in _check_graph_convertibility
    raise RuntimeError(message)
RuntimeError: Function you are trying to compile cannot be converted to MLIR

%0 = x              # EncryptedScalar<uint3>          ∈ [3, 5]
%1 = sin(%0)        # EncryptedScalar<float64>        ∈ [-0.958924, 0.14112]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only integer operations are supported
                                                                             /path/to/your/script.py:6
return %1
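After a failed compilation, it can be handy to check which artifact files were actually written. Here is a minimal sketch using only the standard library; list_artifacts is a hypothetical helper, and the default path assumes the .artifacts directory in the current working directory described above.

```python
from pathlib import Path


def list_artifacts(directory: str = ".artifacts") -> list:
    """Return the sorted names of the files found in the artifacts directory."""
    root = Path(directory)
    if not root.is_dir():
        return []  # nothing was exported, or the path is wrong
    return sorted(entry.name for entry in root.iterdir() if entry.is_file())


for name in list_artifacts():
    print(name)
```

An empty result simply means no artifacts were found at that location.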

Manual exports

Manual exports are mostly used for visualization and demonstrations. Here is how to perform one:

from concrete import fhe
import numpy as np

artifacts = fhe.DebugArtifacts("/tmp/custom/export/path")

@fhe.compiler({"x": "encrypted"})
def f(x):
    return 127 - (50 * (np.sin(x) + 1)).astype(np.int64)

inputset = range(2 ** 3)
circuit = f.compile(inputset, artifacts=artifacts)

artifacts.export()

After running the code, you will find the following files under the /tmp/custom/export/path directory:

  • 1.initial.graph.txt: The textual representation of the initial computation graph right after tracing.

%0 = x                             # EncryptedScalar<uint1>
%1 = sin(%0)                       # EncryptedScalar<float64>
%2 = 1                             # ClearScalar<uint1>
%3 = add(%1, %2)                   # EncryptedScalar<float64>
%4 = 50                            # ClearScalar<uint6>
%5 = multiply(%4, %3)              # EncryptedScalar<float64>
%6 = astype(%5, dtype=int_)        # EncryptedScalar<uint1>
%7 = 127                           # ClearScalar<uint7>
%8 = subtract(%7, %6)              # EncryptedScalar<uint1>
return %8

  • 2.after-fusing.graph.txt: The textual representation of the intermediate computation graph after fusing.

%0 = x                       # EncryptedScalar<uint1>
%1 = subgraph(%0)            # EncryptedScalar<uint1>
%2 = 127                     # ClearScalar<uint7>
%3 = subtract(%2, %1)        # EncryptedScalar<uint1>
return %3

Subgraphs:

    %1 = subgraph(%0):

        %0 = input                         # EncryptedScalar<uint1>
        %1 = sin(%0)                       # EncryptedScalar<float64>
        %2 = 1                             # ClearScalar<uint1>
        %3 = add(%1, %2)                   # EncryptedScalar<float64>
        %4 = 50                            # ClearScalar<uint6>
        %5 = multiply(%4, %3)              # EncryptedScalar<float64>
        %6 = astype(%5, dtype=int_)        # EncryptedScalar<uint1>
        return %6

  • 3.final.graph.txt: The textual representation of the final computation graph right before MLIR conversion.

%0 = x                       # EncryptedScalar<uint3>        ∈ [0, 7]
%1 = subgraph(%0)            # EncryptedScalar<uint7>        ∈ [2, 95]
%2 = 127                     # ClearScalar<uint7>            ∈ [127, 127]
%3 = subtract(%2, %1)        # EncryptedScalar<uint7>        ∈ [32, 125]
return %3

Subgraphs:

    %1 = subgraph(%0):

        %0 = input                         # EncryptedScalar<uint1>
        %1 = sin(%0)                       # EncryptedScalar<float64>
        %2 = 1                             # ClearScalar<uint1>
        %3 = add(%1, %2)                   # EncryptedScalar<float64>
        %4 = 50                            # ClearScalar<uint6>
        %5 = multiply(%4, %3)              # EncryptedScalar<float64>
        %6 = astype(%5, dtype=int_)        # EncryptedScalar<uint1>
        return %6

  • mlir.txt: Information about the MLIR of the function which was compiled using the provided input-set.

module {
  func.func @main(%arg0: !FHE.eint<7>) -> !FHE.eint<7> {
    %c127_i8 = arith.constant 127 : i8
    %cst = arith.constant dense<"..."> : tensor<128xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<7>, tensor<128xi64>) -> !FHE.eint<7>
    %1 = "FHE.sub_int_eint"(%c127_i8, %0) : (i8, !FHE.eint<7>) -> !FHE.eint<7>
    return %1 : !FHE.eint<7>
  }
}

  • client_parameters.json: Information about the client parameters chosen by Concrete.

{
    "bootstrapKeys": [
        {
            "baseLog": 22,
            "glweDimension": 1,
            "inputLweDimension": 908,
            "inputSecretKeyID": 1,
            "level": 1,
            "outputSecretKeyID": 0,
            "polynomialSize": 8192,
            "variance": 4.70197740328915e-38
        }
    ],
    "functionName": "main",
    "inputs": [
        {
            "encryption": {
                "encoding": {
                    "isSigned": false,
                    "precision": 7
                },
                "secretKeyID": 0,
                "variance": 4.70197740328915e-38
            },
            "shape": {
                "dimensions": [],
                "sign": false,
                "size": 0,
                "width": 7
            }
        }
    ],
    "keyswitchKeys": [
        {
            "baseLog": 3,
            "inputSecretKeyID": 0,
            "level": 6,
            "outputSecretKeyID": 1,
            "variance": 1.7944329123150665e-13
        }
    ],
    "outputs": [
        {
            "encryption": {
                "encoding": {
                    "isSigned": false,
                    "precision": 7
                },
                "secretKeyID": 0,
                "variance": 4.70197740328915e-38
            },
            "shape": {
                "dimensions": [],
                "sign": false,
                "size": 0,
                "width": 7
            }
        }
    ],
    "packingKeyswitchKeys": [],
    "secretKeys": [
        {
            "dimension": 8192
        },
        {
            "dimension": 908
        }
    ]
}

Asking the community

Submitting an issue

  • Avoid randomness to ensure reproducibility of the bug

  • Minimize your function while keeping the bug to expedite the fix

  • Include your input-set in the issue

  • Provide clear reproduction steps

  • Include debug artifacts in the issue

  • Give a minimal example of the desired behavior

  • Explain your use case

Deploy

This document explains how to deploy a circuit after development. Once your circuit is ready, you may want to deploy it without sharing the circuit's details with every client, or run the computations on dedicated servers. In this scenario, you can use the Client and Server features of Concrete.

Deployment process

In a typical Concrete deployment:

  • The server hosts the compilation artifact, including client specifications and the FHE executable.

  • The client requests circuit requirements, generates keys, sends an encrypted payload, and receives an encrypted result.

Example

Follow these steps to deploy your circuit:

  1. Develop the circuit: You can develop your own circuit using the techniques discussed in previous chapters. Here is an example.

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def function(x):
    return x + 42

inputset = range(10)
circuit = function.compile(inputset)

  2. Save the server files: Once you have your circuit, save the necessary server files.

circuit.server.save("server.zip")

  3. Send the server files: Send server.zip to your computation server.

Setting up a server

  1. Load the server files: To set up the server, load the server.zip file received from the development machine.

from concrete import fhe

server = fhe.Server.load("server.zip")

  2. Prepare for client requests: The server needs to wait for the requests from clients.

  3. Serialize ClientSpecs: Requests typically start with ClientSpecs, as clients need ClientSpecs to generate keys and request computation.

serialized_client_specs: str = server.client_specs.serialize()

  4. Send serialized ClientSpecs to clients.

Setting up clients

  1. Create the client object: After receiving the serialized ClientSpecs from a server, create the Client object.

client_specs = fhe.ClientSpecs.deserialize(serialized_client_specs)
client = fhe.Client(client_specs)

Generating keys (client-side)

  1. Generate keys: Once you have the Client object, perform key generation. This method generates encryption/decryption keys and evaluation keys.

client.keys.generate()

  2. Serialize the evaluation keys: The server needs access to the evaluation keys. You can serialize your evaluation keys as below.

serialized_evaluation_keys: bytes = client.evaluation_keys.serialize()

  3. Send the evaluation keys to the server.

Encrypting inputs (client-side)

  1. Encrypt inputs: Encrypt your inputs and request the server to perform some computation. This can be done in the following way.

arg: fhe.Value = client.encrypt(7)
serialized_arg: bytes = arg.serialize()

  2. Send the serialized arguments to the server.

Performing computation (server-side)

  1. Deserialize received data: On the server, deserialize the evaluation keys and arguments received from the client.

deserialized_evaluation_keys = fhe.EvaluationKeys.deserialize(serialized_evaluation_keys)
deserialized_arg = fhe.Value.deserialize(serialized_arg)

  2. Run the computation: Perform the computation and serialize the result.

result: fhe.Value = server.run(deserialized_arg, evaluation_keys=deserialized_evaluation_keys)
serialized_result: bytes = result.serialize()

  3. Send the serialized result to the client.

Clear arguments can be passed directly to server.run (for example, server.run(x, 10, z, evaluation_keys=...)).

Decrypting the result (client-side)

  1. Deserialize the result: Once you receive the serialized result from the server, deserialize it.

deserialized_result = fhe.Value.deserialize(serialized_result)

  2. Decrypt the deserialized result:

decrypted_result = client.decrypt(deserialized_result)
assert decrypted_result == 49

Deployment of modules

For example, consider a module compiled in the following way:

from concrete import fhe

@fhe.module()
class MyModule:
    @fhe.function({"x": "encrypted"})
    def inc(x):
        return x + 1

    @fhe.function({"x": "encrypted"})
    def dec(x):
        return x - 1

inputset = list(range(20))
my_module = MyModule.compile({"inc": inputset, "dec": inputset})

You can extract the server from the module and save it in a file:

my_module.server.save("server.zip")

The only noticeable difference between the deployment of modules and the deployment of circuits is that the methods Client::encrypt, Client::decrypt and Server::run must contain an extra function_name argument specifying the name of the targeted function.

For example, to encrypt an argument for the inc function of the module:

arg = client.encrypt(7, function_name="inc")
serialized_arg = arg.serialize()

To execute the inc function:

result = server.run(deserialized_arg, evaluation_keys=deserialized_evaluation_keys, function_name="inc")
serialized_result = result.serialize()

To decrypt a result from the execution of dec:

decrypted_result = client.decrypt(deserialized_result, function_name="dec")

Optimization

This guide explains how to optimize Concrete circuits extensively.

It's split in 3 sections:

Improve parallelism

This guide introduces the different options for parallelism in Concrete and how to utilize them to improve the execution time of Concrete circuits.

Modern CPUs have multiple cores to perform computation and utilizing multiple cores is a great way to boost performance.

There are two kinds of parallelism in Concrete:

Loop parallelism is enabled by default, as it's supported on all platforms. Dataflow parallelism, however, is only supported on Linux and is therefore not enabled by default.

TFHE-rs Interoperability

This feature is currently in beta version. Please note that the API may change in future Concrete releases.

Overview

There are differences between Concrete and TFHE-rs, so ensuring compatibility between them involves more than just data serialization. To achieve compatibility, we need to consider two main aspects.

Encoding differences

Both TFHE-rs and Concrete libraries use Learning With Errors (LWE) ciphertexts, but integers are encoded differently:

  • In Concrete, integers are simply encoded in a single ciphertext.

  • In TFHE-rs, integers are encoded into multiple ciphertexts using radix decomposition.

Converting between Concrete and TFHE-rs encrypted integers therefore requires an encrypted conversion between the two encodings.

When working with a TFHE-rs integer type in Concrete, you can use the .encode(...) and .decode(...) functions to see this in practice:

from concrete.fhe import tfhers

# don't worry about the API, we will have better examples later.
# we just want to show the encoding here
tfhers_params = tfhers.CryptoParams(
    lwe_dimension=909,
    glwe_dimension=1,
    polynomial_size=4096,
    pbs_base_log=15,
    pbs_level=2,
    lwe_noise_distribution=9.743962418842052e-07,
    glwe_noise_distribution=2.168404344971009e-19,
    encryption_key_choice=tfhers.EncryptionKeyChoice.BIG,
)

# TFHERSInteger using this type will be represented as a vector of 8/2=4 integers
tfhers_type = tfhers.TFHERSIntegerType(
    is_signed=False,
    bit_width=8,
    carry_width=3,
    msg_width=2,
    params=tfhers_params,
)

assert (tfhers_type.encode(123) == [3, 2, 3, 1]).all()

assert tfhers_type.decode([3, 2, 3, 1]) == 123
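To see where these digits come from, the radix decomposition can be reproduced in plain Python on clear values. This is only a sketch: radix_encode and radix_decode are hypothetical helpers rather than Concrete or TFHE-rs functions, the carry bits are not modeled, and the least-significant-digit-first order is inferred from the assertion above.

```python
from typing import List


def radix_encode(value: int, bit_width: int, msg_width: int) -> List[int]:
    """Split value into bit_width // msg_width digits of msg_width bits each,
    least significant digit first."""
    base = 1 << msg_width
    digits = []
    for _ in range(bit_width // msg_width):
        digits.append(value % base)
        value //= base
    return digits


def radix_decode(digits: List[int], msg_width: int) -> int:
    """Recombine digits produced by radix_encode into the original integer."""
    return sum(digit << (index * msg_width) for index, digit in enumerate(digits))


encoded = radix_encode(123, bit_width=8, msg_width=2)
decoded = radix_decode(encoded, msg_width=2)
print(encoded, decoded)
```

With bit_width=8 and msg_width=2, each of the four digits holds two bits of the message, which matches the "8/2=4 integers" comment in the example above.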

Parameter match

The Concrete Optimizer may find parameters which are not in TFHE-rs's pre-computed list. To ensure compatibility, you need to either fix or constrain the search space in parts of the circuit where compatibility is required. This ensures that compatible parameters are used consistently.

Scenarios

There are 2 different approaches to using Concrete and TFHE-rs depending on the situation.

Serialization of ciphertexts and keys

Reducing TLU

This guide teaches how to improve the execution time of Concrete circuits by reducing the amount of table lookups.

Reducing the amount of table lookups is probably the most complicated guide in this section as it's not automated. The idea is to use mathematical properties of operations to reduce the amount of table lookups needed to achieve the result.

One great example is in adding big integers in bitmap representation. Here is the basic implementation:

There are two table lookups within the loop body, one for >> and one for %.

This implementation is not optimal though, since the same output can be achieved with just a single table lookup:

This is possible because the original operations are mathematically equivalent to the optimized ones, and the optimized operations produce the same output with fewer table lookups!
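The equivalence can be checked on clear integers without any FHE. The sketch below uses hypothetical helper names and plain Python, and simply confirms that taking the value modulo 2 gives the same bit as subtracting twice the carry.

```python
def bit_with_modulo(total: int) -> int:
    """Result bit computed with % (a second table lookup in the circuit)."""
    return total % 2


def bit_with_subtraction(total: int) -> int:
    """Result bit computed by reusing the carry obtained from >>."""
    carry = total >> 1
    return total - (carry * 2)


# With single-bit inputs plus a carry, the per-position total lies in 0..3,
# but the identity holds for any non-negative integer.
equivalent = all(
    bit_with_modulo(total) == bit_with_subtraction(total) for total in range(64)
)
print(equivalent)
```

Since the carry is needed anyway, reusing it for the result bit leaves >> as the only table lookup in the loop body.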

Here is the full code example and some numbers for this optimization:

prints:

which is almost half the amount of table lookups and ~2x less complexity for the same operation!

Two options are available to help you understand the compilation process:

You can seek help with your issue by asking a question directly in the community forum.

If you cannot find a solution in the community forum, or if you have found a bug in the library, you could create an issue in our GitHub repository.

For bug reports, try to:

For feature requests, try to:

Serialized evaluation keys are very large, even if they are compressed and can be reused several times: consider caching them on the server
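One way to picture the caching advice is a small in-memory store on the server, so that a client uploads its serialized evaluation keys once and later refers to them by an identifier. This is a sketch under stated assumptions: EvaluationKeyCache and the use of a SHA-256 digest as the identifier are illustrative choices, not part of Concrete.

```python
import hashlib
from typing import Dict, Optional


class EvaluationKeyCache:
    """In-memory cache of serialized evaluation keys, indexed by their digest."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def put(self, serialized_keys: bytes) -> str:
        """Store the keys and return the identifier the client should reuse."""
        key_id = hashlib.sha256(serialized_keys).hexdigest()
        self._store[key_id] = serialized_keys
        return key_id

    def get(self, key_id: str) -> Optional[bytes]:
        """Return the cached keys, or None if the client must upload them again."""
        return self._store.get(key_id)


cache = EvaluationKeyCache()
key_id = cache.put(b"stand-in for serialized evaluation keys")
print(key_id[:12], cache.get(key_id) is not None)
```

On the server, the cached bytes would then be turned back into keys with fhe.EvaluationKeys.deserialize before calling server.run. A production setup would likely add eviction and persistent storage, which are left out here.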

Deploying a module follows the same logic as the deployment of circuits.

Improve parallelism: to make circuits utilize more cores.

Optimize table lookups: to optimize the most expensive operation in Concrete.

Optimize cryptographic parameters: to make Concrete select more performant parameters.

Loop parallelism to make tensor operations parallel, achieved by using OpenMP

Dataflow parallelism to make independent operations parallel, achieved by using HPX

This guide explains how to combine Concrete and TFHE-rs computations together. This allows you to convert ciphertexts from Concrete to TFHE-rs, and vice versa, and to run a computation with both libraries without requiring a decryption.

Scenario 1: Shared secret key. In this scenario, a single party aims to combine both Concrete and TFHE-rs in a computation. In this case, a shared secret key will be used, while different keysets will be held for Concrete and TFHE-rs.

Scenario 2: Pregenerated TFHE-rs keys. This scenario involves two parties, each with a pre-established set of TFHE-rs keysets. The objective is to compute on encrypted TFHE-rs data using Concrete. In this case, there is no shared secret key. The party using Concrete will rely solely on TFHE-rs public keys and must optimize the parameters accordingly, while the party using TFHE-rs handles encryption, decryption, and computation.

Concrete already has its own serialization functions (such as tfhers_bridge.export_value, tfhers_bridge.import_value, tfhers_bridge.keygen_with_initial_keys, tfhers_bridge.serialize_input_secret_key, and so on). However, when implementing a TFHE-rs computation in Rust, we must use a compatible serialization. Learn more in Serialization of ciphertexts and keys.

def add_bitmaps(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >> 1
        result[i] = addition_and_carry % 2

    return result

def add_bitmaps(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >> 1
        result[i] = addition_and_carry - (carry * 2)

    return result

import numpy as np
from concrete import fhe

N = 32

def add_bitmaps_naive(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >= 2
        result[i] = addition_and_carry % 2

    return result

def add_bitmaps_optimized(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >> 1
        result[i] = addition_and_carry - (carry * 2)

    return result

inputset = fhe.inputset(fhe.tensor[fhe.uint1, N], fhe.tensor[fhe.uint1, N])
for (name, implementation) in [("naive", add_bitmaps_naive), ("optimized", add_bitmaps_optimized)]:
    compiler = fhe.Compiler(implementation, {"x": "encrypted", "y": "encrypted"})
    circuit = compiler.compile(inputset)

    print(
        f"{name:>9} implementation "
        f"-> {int(circuit.programmable_bootstrap_count)} table lookups "
        f"-> {int(circuit.complexity):_} complexity"
    )
    naive implementation -> 63 table lookups -> 2_427_170_697 complexity
optimized implementation -> 32 table lookups -> 1_224_206_208 complexity

Error probability

This guide explains how setting the p_error configuration option can affect the performance of Concrete circuits.

For example:

import numpy as np
from concrete import fhe

def f(x, y):
    return (x // 2) * (y // 3)

inputset = fhe.inputset(fhe.uint4, fhe.uint4)
for p_error in [(1 / 1_000_000), (1 / 100_000), (1 / 10_000), (1 / 1_000), (1 / 100)]:
    compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
    circuit = compiler.compile(inputset, p_error=p_error)
    print(f"p_error of {p_error:.6f} -> {int(circuit.complexity):_} complexity")

This prints:

p_error of 0.000001 -> 294_773_524 complexity
p_error of 0.000010 -> 286_577_520 complexity
p_error of 0.000100 -> 275_887_080 complexity
p_error of 0.001000 -> 265_196_640 complexity
p_error of 0.010000 -> 184_144_972 complexity

Round/truncating

This guide teaches how to improve the execution time of Concrete circuits by using some special operations that reduce the bit width of the input of the table lookup.

For example, the following code:

import numpy as np
from concrete import fhe

inputset = fhe.inputset(fhe.uint10)
for lsbs_to_remove in range(0, 10):
    def f(x):
        return fhe.round_bit_pattern(x, lsbs_to_remove) // 2

    compiler = fhe.Compiler(f, {"x": "encrypted"})
    circuit = compiler.compile(inputset)

    print(f"{lsbs_to_remove=} -> {int(circuit.complexity):>13_} complexity")

prints:

lsbs_to_remove=0 -> 9_134_406_574 complexity
lsbs_to_remove=1 -> 3_209_430_092 complexity
lsbs_to_remove=2 -> 1_536_476_735 complexity
lsbs_to_remove=3 -> 1_588_749_586 complexity
lsbs_to_remove=4 ->   848_133_081 complexity
lsbs_to_remove=5 ->   525_987_801 complexity
lsbs_to_remove=6 ->   358_276_023 complexity
lsbs_to_remove=7 ->   373_311_341 complexity
lsbs_to_remove=8 ->   400_596_351 complexity
lsbs_to_remove=9 ->   438_681_996 complexity

Adjusting table lookup error probability is discussed extensively in the Table lookup exactness section. The idea is to sacrifice exactness to gain performance.

There are two extensions that can reduce the bit width of the table lookup input, fhe.round_bit_pattern(...) and fhe.truncate_bit_pattern(...), which can improve performance by sacrificing exactness.
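To make the difference between the two extensions concrete, here is a minimal plain-Python sketch of the integer arithmetic only. The helper names `truncate_model` and `round_model` are hypothetical and are not part of Concrete; the sketch assumes that both operations keep the value at its original scale and simply clear the removed bits.

```python
def truncate_model(x: int, lsbs_to_remove: int) -> int:
    """Clear the lowest bits, which always rounds down (hypothetical helper)."""
    return (x >> lsbs_to_remove) << lsbs_to_remove


def round_model(x: int, lsbs_to_remove: int) -> int:
    """Round to the nearest multiple of 2**lsbs_to_remove, ties going up."""
    if lsbs_to_remove == 0:
        return x
    half = 1 << (lsbs_to_remove - 1)
    return ((x + half) >> lsbs_to_remove) << lsbs_to_remove


values = [0b0101, 0b0110, 0b0111]
truncated = [truncate_model(v, 2) for v in values]
rounded = [round_model(v, 2) for v in values]

for v, t, r in zip(values, truncated, rounded):
    print(f"{v:04b} -> truncated {t:04b}, rounded {r:04b}")
```

In both cases the following table lookup only has to distinguish the remaining most significant bits, which is where the performance gain comes from.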

Dataflow parallelism

This guide explains dataflow parallelism and how it can improve the execution time of Concrete circuits.

Dataflow parallelism is particularly useful when the circuit performs computations that are neither completely independent (such as loop/doall parallelism) nor fully dependent (e.g., sequential, non-parallelizable code). In such cases, dataflow tasks can execute as soon as their inputs are available, which minimizes over-synchronization.

Without dataflow parallelism, the circuit is executed operation by operation, like an imperative language. If the operations themselves are not tensorized, loop parallelism would not be utilized and the entire execution would happen in a single thread. Dataflow parallelism changes this by analyzing the operations and their dependencies within the circuit to determine what can be done in parallel and what cannot. Then it distributes the tasks that can be done in parallel to different threads.

For example:

import time

import numpy as np
from concrete import fhe

def f(x, y, z):
    # normally, you'd use fhe.array to construct a concrete tensor
    # but for this example, we just create a simple numpy array
    # so the matrix multiplication can happen on a cellular level
    a = np.array([[x, y], [z, 2]])
    b = np.array([[1, x], [z, y]])
    return fhe.array(a @ b)

inputset = fhe.inputset(fhe.uint3, fhe.uint3, fhe.uint3)

for dataflow_parallelize in [False, True]:
    compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted", "z": "encrypted"})
    circuit = compiler.compile(inputset, dataflow_parallelize=dataflow_parallelize)

    circuit.keygen()
    for sample in inputset[:3]:  # warmup
        circuit.encrypt_run_decrypt(*sample)

    timings = []
    for sample in inputset[3:13]:
        start = time.time()
        result = circuit.encrypt_run_decrypt(*sample)
        end = time.time()

        assert np.array_equal(result, f(*sample))
        timings.append(end - start)

    if not dataflow_parallelize:
        print(f"without dataflow parallelize -> {np.mean(timings):.03f}s")
    else:
        print(f"   with dataflow parallelize -> {np.mean(timings):.03f}s")

This prints:

without dataflow parallelize -> 0.609s
   with dataflow parallelize -> 0.414s

The reason for that is:

// this is the generated MLIR for the circuit
// without dataflow, every single line would be executed one after the other

module {
  func.func @main(%arg0: !FHE.eint<7>, %arg1: !FHE.eint<7>, %arg2: !FHE.eint<7>) -> tensor<2x2x!FHE.eint<7>> {
  
    // but if you look closely, you can see that this multiplication
    %c1_i2 = arith.constant 1 : i2
    %0 = "FHE.mul_eint_int"(%arg0, %c1_i2) : (!FHE.eint<7>, i2) -> !FHE.eint<7>
    
    // is completely independent of this one, so dataflow makes them run in parallel
    %1 = "FHE.mul_eint"(%arg1, %arg2) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    
    // however, this addition needs the first two operations
    // so dataflow waits until both are done before performing this one
    %2 = "FHE.add_eint"(%0, %1) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    
    // lastly, this multiplication is completely independent from the first three operations
    // so its execution starts in parallel when execution starts with dataflow
    %3 = "FHE.mul_eint"(%arg0, %arg0) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    
    // similar logic can be applied to the remaining operations...
    %4 = "FHE.mul_eint"(%arg1, %arg1) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    %5 = "FHE.add_eint"(%3, %4) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    %6 = "FHE.mul_eint_int"(%arg2, %c1_i2) : (!FHE.eint<7>, i2) -> !FHE.eint<7>
    %c2_i3 = arith.constant 2 : i3
    %7 = "FHE.mul_eint_int"(%arg2, %c2_i3) : (!FHE.eint<7>, i3) -> !FHE.eint<7>
    %8 = "FHE.add_eint"(%6, %7) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    %9 = "FHE.mul_eint"(%arg2, %arg0) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    %10 = "FHE.mul_eint_int"(%arg1, %c2_i3) : (!FHE.eint<7>, i3) -> !FHE.eint<7>
    %11 = "FHE.add_eint"(%9, %10) : (!FHE.eint<7>, !FHE.eint<7>) -> !FHE.eint<7>
    %from_elements = tensor.from_elements %2, %5, %8, %11 : tensor<2x2x!FHE.eint<7>>
    return %from_elements : tensor<2x2x!FHE.eint<7>>
    
  }
}

To summarize, dataflow analyzes the circuit to determine which parts of the circuit can be run at the same time, and tries to run as many operations as possible in parallel.
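The dependency analysis can be illustrated with a small plain-Python sketch. It is not how the dataflow runtime is implemented: the `schedule_waves` helper is hypothetical, constants are omitted, and a real dataflow runtime starts each task as soon as its own inputs are ready instead of proceeding in lockstep waves. The dependencies are copied from the MLIR above.

```python
# Operation -> operands it reads, copied from the MLIR above (constants omitted).
deps = {
    "%0": ["%arg0"],
    "%1": ["%arg1", "%arg2"],
    "%2": ["%0", "%1"],
    "%3": ["%arg0"],
    "%4": ["%arg1"],
    "%5": ["%3", "%4"],
    "%6": ["%arg2"],
    "%7": ["%arg2"],
    "%8": ["%6", "%7"],
    "%9": ["%arg2", "%arg0"],
    "%10": ["%arg1"],
    "%11": ["%9", "%10"],
}


def schedule_waves(deps, inputs):
    """Group operations into waves whose members are independent of each other."""
    available = set(inputs)
    remaining = dict(deps)
    waves = []
    while remaining:
        ready = [
            op
            for op, operands in remaining.items()
            if all(operand in available for operand in operands)
        ]
        if not ready:
            raise ValueError("cyclic dependency")
        waves.append(ready)
        available.update(ready)
        for op in ready:
            del remaining[op]
    return waves


waves = schedule_waves(deps, ["%arg0", "%arg1", "%arg2"])
for index, wave in enumerate(waves):
    print(f"wave {index}: {', '.join(wave)}")
```

Every operation in the same wave could run on a different thread, while the additions have to wait for the multiplications that feed them.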

When the circuit is tensorized, dataflow might slow execution down since the tensor operations already use multiple threads and adding dataflow on top creates congestion in the CPU between the HPX (dataflow parallelism runtime) and OpenMP (loop parallelism runtime). So try both before deciding on whether to use dataflow or not.

Table lookup exactness

Table Lookups advanced

Table Lookups (TLUs) are one of the most common operations in Concrete. All operations except addition, subtraction, multiplication with non-encrypted values, tensor manipulation operations, and a few operations built with those primitive operations (e.g., matmul, conv) are converted to Table Lookups under the hood.

Table Lookups are very flexible. They allow Concrete to support many operations, but they are expensive. The exact cost depends on many variables (hardware used, error probability, etc.), but they are always much more expensive than other operations. You should try to avoid them as much as possible. It's not always possible to avoid them completely, but you might be able to reduce the number of TLUs or replace some of them with other primitive operations.

Concrete automatically parallelizes TLUs if they are applied to tensors.

Direct table lookup

Concrete provides a LookupTable class to create your own tables and apply them in your circuits.

LookupTables can have any number of elements. Let's call the number of elements N. As long as the lookup variable is within the range [-N, N), the Table Lookup is valid.

If you go outside of this range, you will receive the following error:

IndexError: index 10 is out of bounds for axis 0 with size 6

With scalars.

You can create the lookup table using a list of integers and apply it using indexing:

from concrete import fhe

table = fhe.LookupTable([2, -1, 3, 0])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

inputset = range(4)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0) == table[0] == 2
assert circuit.encrypt_run_decrypt(1) == table[1] == -1
assert circuit.encrypt_run_decrypt(2) == table[2] == 3
assert circuit.encrypt_run_decrypt(3) == table[3] == 0

With tensors.

When you apply a table lookup to a tensor, the scalar table lookup is applied to each element of the tensor:

from concrete import fhe
import numpy as np

table = fhe.LookupTable([2, -1, 3, 0])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

inputset = [np.random.randint(0, 4, size=(2, 3)) for _ in range(10)]
circuit = f.compile(inputset)

sample = [
    [0, 1, 3],
    [2, 3, 1],
]
expected_output = [
    [2, -1, 0],
    [3, 0, -1],
]
actual_output = circuit.encrypt_run_decrypt(np.array(sample))

for i in range(2):
    for j in range(3):
        assert actual_output[i][j] == expected_output[i][j] == table[sample[i][j]]

With negative values.

LookupTable mimics array indexing in Python, which means if the lookup variable is negative, the table is looked up from the back:

from concrete import fhe

table = fhe.LookupTable([2, -1, 3, 0])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[-x]

inputset = range(1, 5)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(1) == table[-1] == 0
assert circuit.encrypt_run_decrypt(2) == table[-2] == 3
assert circuit.encrypt_run_decrypt(3) == table[-3] == -1
assert circuit.encrypt_run_decrypt(4) == table[-4] == 2

Direct multi-table lookup

If you want to apply a different lookup table to each element of a tensor, you can have a LookupTable of LookupTables:

from concrete import fhe
import numpy as np

squared = fhe.LookupTable([i ** 2 for i in range(4)])
cubed = fhe.LookupTable([i ** 3 for i in range(4)])

table = fhe.LookupTable([
    [squared, cubed],
    [squared, cubed],
    [squared, cubed],
])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

inputset = [np.random.randint(0, 4, size=(3, 2)) for _ in range(10)]
circuit = f.compile(inputset)

sample = [
    [0, 1],
    [2, 3],
    [3, 0],
]
expected_output = [
    [0, 1],
    [4, 27],
    [9, 0]
]
actual_output = circuit.encrypt_run_decrypt(np.array(sample))

for i in range(3):
    for j in range(2):
        if j == 0:
            assert actual_output[i][j] == expected_output[i][j] == squared[sample[i][j]]
        else:
            assert actual_output[i][j] == expected_output[i][j] == cubed[sample[i][j]]

In this example, we applied a squared table to the first column and a cubed table to the second column.
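The element-wise semantics can be checked in the clear with plain Python lists. This is only a sketch of what the multi-table lookup computes, not of how Concrete evaluates it; the variable names are illustrative.

```python
squared = [i ** 2 for i in range(4)]
cubed = [i ** 3 for i in range(4)]

# One table per tensor position, mirroring the shape (3, 2) of the input.
tables = [
    [squared, cubed],
    [squared, cubed],
    [squared, cubed],
]

sample = [
    [0, 1],
    [2, 3],
    [3, 0],
]

# Each element is looked up in the table that sits at the same position.
output = [
    [tables[i][j][sample[i][j]] for j in range(2)]
    for i in range(3)
]
print(output)
```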

Fused table lookup

Concrete tries to fuse some operations into table lookups automatically so that lookup tables don't need to be created manually:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (42 * np.sin(x)).astype(np.int64) // 10

inputset = range(8)
circuit = f.compile(inputset)

for x in range(8):
    assert circuit.encrypt_run_decrypt(x) == f(x)

All lookup tables need to be from integers to integers. So, without .astype(np.int64), Concrete will not be able to fuse.

The function is first traced into:

Concrete then fuses appropriate nodes:

Fusing makes the code more readable and easier to modify, so try to utilize it over manual LookupTables as much as possible.
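The idea behind fusing can be sketched in plain Python: since the subgraph maps integers to integers, it can be tabulated over the input range seen in the inputset. This is an illustration only; the `f_clear` helper is hypothetical, and Concrete performs fusing on the computation graph rather than on Python source.

```python
import math


def f_clear(x: int) -> int:
    # int(...) truncates toward zero, like .astype(np.int64) in the example above
    return int(42 * math.sin(x)) // 10


# One table entry per value of the inputset range(8)
fused_table = [f_clear(x) for x in range(8)]
print(fused_table)
```

A single lookup in such a table replaces the sine, the multiplication, the cast, and the floor division.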

Using automatically created table lookup

Table lookup exactness

TLUs are performed with an FHE operation called Programmable Bootstrapping (PBS). A PBS has a certain probability of error: when such an error occurs, the result is inaccurate.

Let's say you have the table:

lut = [0, 1, 4, 9, 16, 25, 36, 49, 64]

And you perform a Table Lookup using 4. The result you should get is lut[4] = 16, but because of the possibility of error, you could get any other value in the table.

The probability of this error can be configured through the p_error and global_p_error configuration options. The difference between these two options is that p_error applies to individual TLUs, whereas global_p_error applies to the whole circuit.

If you set p_error to 0.01, for example, it means every TLU in the circuit will have a 99% chance (or more) of being exact. If there is a single TLU in the circuit, it corresponds to global_p_error = 0.01 as well. But if we have 2 TLUs, then global_p_error would be higher: that's 1 - (0.99 * 0.99) ~= 0.02 = 2%.

If you set global_p_error to 0.01, the whole circuit will have at most 1% probability of error, no matter how many Table Lookups are included (which means that p_error will be smaller than 0.01 if there are more than a single TLU).

If you set both of them, both will be satisfied. Essentially, the stricter one will be used.
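The relationship between the two options can be reproduced with a few lines of arithmetic. The sketch below assumes that TLU errors are independent and that every TLU gets the same error probability; the helper names are hypothetical, and the way the compiler actually distributes the error budget may differ.

```python
def global_error(p_error: float, tlu_count: int) -> float:
    """Probability that at least one of tlu_count TLUs is inexact."""
    return 1 - (1 - p_error) ** tlu_count


def per_tlu_error(global_p_error: float, tlu_count: int) -> float:
    """Largest uniform per-TLU error that still meets the global target."""
    return 1 - (1 - global_p_error) ** (1 / tlu_count)


print(f"p_error=0.01 with 2 TLUs -> global error {global_error(0.01, 2):.4f}")
print(f"global_p_error=0.01 with 2 TLUs -> p_error {per_tlu_error(0.01, 2):.6f}")
```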

By default, both p_error and global_p_error are set to None, which results in a global_p_error of 1 / 100_000 being used.

Configuring either of those variables impacts compilation and execution times (compilation, keys generation, circuit execution) and space requirements (size of the keys on disk and in memory). Lower error probabilities result in longer compilation and execution times and larger space requirements.

Table lookup performance

Shared key

This document explains how to set up a shared secret key between Concrete and TFHE-rs to perform computations.

In this scenario, a shared secret key will be used, while different keysets will be held for Concrete and TFHE-rs. There are two ways to generate keys, outlined in the following steps:

From Concrete (1.1)

  1. Perform a classical key generation in Concrete, which generates a set of secret and public keys.

  2. Use this secret key to perform a partial key generation in TFHE-rs, starting from the shared secret key and generating the rest of the necessary keys.

From TFHE-rs (1.2)

  1. Perform a classical key generation in TFHE-rs, generating a single secret key and corresponding public keys.

  2. Use the secret key from TFHE-rs to perform a partial keygen in Concrete.

While TFHE-rs uses a single secret key, Concrete may generate more than one, but only one of them should correspond to the TFHE-rs key. The API hides this detail, but will often ask you to provide the position of a given input/output. This position is used to infer which secret key should be used.

After the key generation is complete and we have both keysets, we can perform computations, encryption, and decryption on both ends.

Setup and configuration

In short, we first determine a suitable set of parameters from TFHE-rs and then apply them in Concrete. This ensures that the ciphertexts generated in both systems will be compatible by using the same cryptographic parameters.

from functools import partial

from concrete import fhe
from concrete.fhe import tfhers

tfhers_params = tfhers.CryptoParams(
    lwe_dimension=909,
    glwe_dimension=1,
    polynomial_size=4096,
    pbs_base_log=15,
    pbs_level=2,
    lwe_noise_distribution=9.743962418842052e-07,
    glwe_noise_distribution=2.168404344971009e-19,
    encryption_key_choice=tfhers.EncryptionKeyChoice.BIG,
)
# creating a TFHE-rs ciphertext type with crypto and encoding params
tfhers_type = tfhers.TFHERSIntegerType(
    is_signed=False,
    bit_width=8,
    carry_width=3,
    msg_width=2,
    params=tfhers_params,
)
# this partial will help us create TFHERSInteger with the given type instead of calling
# tfhers.TFHERSInteger(tfhers_type, value) every time
tfhers_int = partial(tfhers.TFHERSInteger, tfhers_type)

Defining the circuit and compiling

We will now define a simple modular addition function. This function takes TFHE-rs inputs, converts them to Concrete format (to_native), runs a computation, and then converts them back to TFHE-rs. The circuit below is a common example that takes and produces TFHE-rs ciphertexts. However, there are other scenarios where you might not convert back to TFHE-rs, or you might convert to a different type than the input. Another possibility is to take one native ciphertext and one TFHE-rs ciphertext.

def compute(tfhers_x, tfhers_y):
    ####### TFHE-rs to Concrete #########

    # x and y are supposed to be TFHE-rs values.
    # to_native will use type information from x and y to do
    # a correct conversion from TFHE-rs to Concrete
    concrete_x = tfhers.to_native(tfhers_x)
    concrete_y = tfhers.to_native(tfhers_y)
    ####### TFHE-rs to Concrete #########

    ####### Concrete Computation ########
    concrete_res = (concrete_x + concrete_y) % 213
    ####### Concrete Computation ########

    ####### Concrete to TFHE-rs #########
    tfhers_res = tfhers.from_native(
        concrete_res, tfhers_type
    )  # we have to specify the type we want to convert to
    ####### Concrete to TFHE-rs #########
    return tfhers_res

We can compile the circuit as usual.

compiler = fhe.Compiler(compute, {"tfhers_x": "encrypted", "tfhers_y": "encrypted"})
inputset = [(tfhers_int(120), tfhers_int(120))]
circuit = compiler.compile(inputset)

You could optionally try the full execution in Concrete:

# encode/encrypt
encrypted_x, encrypted_y = circuit.encrypt(tfhers_type.encode(7), tfhers_type.encode(9))
# run
encrypted_result = circuit.run(encrypted_x, encrypted_y)
# decrypt
result = circuit.decrypt(encrypted_result)
# decode
decoded = tfhers_type.decode(result)

Connecting Concrete and TFHE-rs

We are going to create a TFHE-rs bridge that facilitates the seamless transfer of ciphertexts and keys between Concrete and TFHE-rs.

tfhers_bridge = tfhers.new_bridge(circuit=circuit)

Key generation

In order to establish a shared secret key between Concrete and TFHE-rs, there are two possible methods for key generation. The first method (use case 1.1) involves generating the Concrete keyset first and then using the shared secret key in TFHE-rs to partially generate the TFHE-rs keyset. The second method (use case 1.2) involves doing the opposite. You should only run one of the two following methods.

Remember that one of the key generations needs to be a partial keygen, to ensure that there is a unique and common secret key.

Parameters used in TFHE-rs must be the same as the ones used in Concrete.

KeyGen starts in Concrete (use case 1.1)

First, we generate the Concrete keyset and then serialize the shared secret key that will be used to encrypt the inputs. In our case, this shared secret key is the same for all inputs and outputs.

# generate all keys from scratch (not using initial secret keys)
circuit.keygen()
# since both inputs have the same type, they will use the same secret key, thus we serialize it once
secret_key: bytes = tfhers_bridge.serialize_input_secret_key(input_idx=0)
# we write it to a file to be used by TFHE-rs
with open("secret_key_from_concrete", "wb") as f:
    f.write(secret_key)

The TFHE-rs side then loads this secret key and performs a partial key generation from it:

use tfhe::core_crypto::prelude::LweSecretKey;
use tfhe::ClientKey;

/// ...

let lwe_sk: LweSecretKey<Vec<u64>> = load_lwe_sk("secret_key_from_concrete");
let shortint_key =
    tfhe::shortint::ClientKey::try_from_lwe_encryption_key(
        lwe_sk,
        // Concrete uses these parameters to define the TFHE-rs ciphertext type
        tfhe::shortint::prelude::PARAM_MESSAGE_2_CARRY_3_KS_PBS
    ).unwrap();
let client_key = ClientKey::from_raw_parts(shortint_key.into(), None, None);
let server_key = client_key.generate_server_key();

KeyGen starts in TFHE-rs (use case 1.2)

First, we generate the TFHE-rs keyset and then serialize the shared secret key that will be used to encrypt the inputs:

use tfhe::{prelude::*, ConfigBuilder};
use tfhe::generate_keys;

/// ...

// Concrete uses these parameters to define the TFHE-rs ciphertext type
let config = ConfigBuilder::with_custom_parameters(
        tfhe::shortint::prelude::PARAM_MESSAGE_2_CARRY_3_KS_PBS,
    )
.build();

let (client_key, server_key) = generate_keys(config);
let (integer_ck, _, _) = client_key.clone().into_raw_parts();
let shortint_ck = integer_ck.into_raw_parts();
let (glwe_secret_key, _, _) = shortint_ck.into_raw_parts();
let lwe_secret_key = glwe_secret_key.into_lwe_secret_key();

save_lwe_sk(lwe_secret_key, "secret_key_from_tfhers");

Next, we generate a Concrete keyset using the shared secret key from TFHE-rs.

# this was generated from TFHE-rs
with open("secret_key_from_tfhers", "rb") as f:
    sk_buff = f.read()
# maps input indices to their secret key
input_idx_to_key = {0: sk_buff, 1: sk_buff}
# we do a Concrete keygen starting with an initial set of secret keys
tfhers_bridge.keygen_with_initial_keys(input_idx_to_key_buffer=input_idx_to_key)

Using ciphertexts

At this point, we have everything necessary to encrypt, compute, and decrypt on both Concrete and TFHE-rs. Whether you began key generation in Concrete or in TFHE-rs, the keysets on both sides are compatible.

Now, we'll walk through an encryption and computation process in TFHE-rs, transition to Concrete to run the circuit, and then return to TFHE-rs for decryption.

let x = FheUint8::encrypt(162, &client_key);
let mut y = FheUint8::encrypt(73, &client_key);
// we will add two encrypted integers in TFHE-rs to showcase
// that we are doing some part of the computation in TFHE-rs
// and the rest in Concrete
let z = FheUint8::encrypt(9, &client_key);
y += z;

save_fheuint8(x, "tfhers_x");
save_fheuint8(y, "tfhers_y");

Next, we can load these ciphertexts in Concrete and then run our compiled circuit as usual.

with open("tfhers_x", "rb") as f:
    buff_x = f.read()
with open("tfhers_y", "rb") as f:
    buff_y = f.read()
tfhers_uint8_x = tfhers_bridge.import_value(buff_x, input_idx=0)
tfhers_uint8_y = tfhers_bridge.import_value(buff_y, input_idx=1)

encrypted_result = circuit.run(tfhers_uint8_x, tfhers_uint8_y)

Finally, we can decrypt and decode in Concrete:

result = circuit.decrypt(encrypted_result)
decoded = tfhers_type.decode(result)

assert decoded == (162 + 73 + 9) % 213

... or export it to TFHE-rs for computation/decryption:

buff_out = tfhers_bridge.export_value(encrypted_result, output_idx=0)
# write it to file
with open("tfhers_out", "wb") as f:
    f.write(buff_out)

The exported value can then be loaded and decrypted in TFHE-rs:

let fheuint = load_fheuint8("tfhers_out");
// you can do computation before decryption as well
let result: u8 = fheuint.decrypt(&client_key);

assert!(result == (162 + 73 + 9) % 213)

Compiler workflow

This document explains the different passes happening in the compilation process, from the Concrete Python frontend to the Concrete MLIR compiler.

The next step in compilation is transforming the computation graph. There are many transformations we perform, and these are discussed in their own sections. The result of a transformation is another computation graph.

After transformations are applied, we need to determine the bounds (i.e., the minimum and the maximum values) of each intermediate node. This is required because FHE allows limited precision for computations. Measuring these bounds helps determine the required precision for the function.

The frontend is almost done at this stage and only needs to transform the computation graph to equivalent MLIR code. Once the MLIR is generated, our Compiler backend takes over. Any other frontend wishing to use the Compiler needs to plug in at this stage.

Tracing

We start with a Python function f, such as this one:

def f(x):
    return (2 * x) + 3

The goal of tracing is to create the following computation graph without requiring any change from the user.

(Note that the edge labels are for non-commutative operations. For example, a subtraction node represents (predecessor with edge label 0) - (predecessor with edge label 1).)

To do this, we make use of Tracers, which are objects that record the operation performed during their creation. We create a Tracer for each argument of the function and call the function with those Tracers. Tracers make use of the operator overloading feature of Python to achieve their goal:

def f(x, y):
    return x + 2 * y

x = Tracer(computation=Input("x"))
y = Tracer(computation=Input("y"))

resulting_tracer = f(x, y)

2 * y will be performed first, and * is overloaded for Tracer to return another tracer: Tracer(computation=Multiply(Constant(2), self.computation)), which is equal to Tracer(computation=Multiply(Constant(2), Input("y"))).

x + (2 * y) will be performed next, and + is overloaded for Tracer to return another tracer: Tracer(computation=Add(self.computation, (2 * y).computation)), which is equal to Tracer(computation=Add(Input("x"), Multiply(Constant(2), Input("y")))).

In the end, we will have output tracers that can be used to create the computation graph. The implementation is a bit more complex than this, but the idea is the same.
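The mechanism described above can be sketched in a few lines. This is a minimal illustration and not the Tracer class used by Concrete: the Input, Constant, Add and Multiply node classes and the `_node` helper are assumptions made for the example, and only + and * are overloaded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Input:
    name: str


@dataclass(frozen=True)
class Constant:
    value: int


@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object


@dataclass(frozen=True)
class Multiply:
    lhs: object
    rhs: object


def _node(value):
    """Unwrap a Tracer, or wrap a plain Python number into a Constant node."""
    return value.computation if isinstance(value, Tracer) else Constant(value)


class Tracer:
    """Records the operation that produced it instead of computing a value."""

    def __init__(self, computation):
        self.computation = computation

    def __add__(self, other):
        return Tracer(Add(self.computation, _node(other)))

    def __radd__(self, other):
        return Tracer(Add(_node(other), self.computation))

    def __mul__(self, other):
        return Tracer(Multiply(self.computation, _node(other)))

    def __rmul__(self, other):
        return Tracer(Multiply(_node(other), self.computation))


def f(x, y):
    return x + 2 * y


resulting_tracer = f(Tracer(computation=Input("x")), Tracer(computation=Input("y")))
print(resulting_tracer.computation)
```

Note that f itself is unchanged: 2 * y dispatches to `__rmul__` because the left operand is a plain integer, which is what lets tracing work without any change from the user.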

Tracing is also responsible for indicating whether the values in the node would be encrypted or not. The rule for that is: if a node has an encrypted predecessor, it is encrypted as well.
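The propagation rule can be written as a short recursive check. The sketch below uses a hypothetical dictionary representation of the graph of (2 * x) + 3 and is not the data structure used by Concrete.

```python
# Node -> predecessors, for the graph of (2 * x) + 3
graph = {
    "x": [],
    "2": [],
    "*": ["2", "x"],
    "3": [],
    "+": ["*", "3"],
}

# Only inputs carry an explicit status; constants are clear.
encrypted_inputs = {"x"}


def is_encrypted(node: str) -> bool:
    """A node is encrypted if it is an encrypted input or has an encrypted predecessor."""
    predecessors = graph[node]
    if not predecessors:
        return node in encrypted_inputs
    return any(is_encrypted(p) for p in predecessors)


status = {node: is_encrypted(node) for node in graph}
print(status)
```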

Topological transforms

The goal of topological transforms is to make more functions compilable.

With the current version of Concrete, floating-point inputs and floating-point outputs are not supported. However, if the floating-point operations are intermediate operations, they can sometimes be fused into a single table lookup from integer to integer, thanks to some specific transforms.

Let's take a closer look at the transforms we can currently perform.

Fusing.

Bounds measurement

Given a computation graph, the goal of the bounds measurement step is to assign the minimal data type to each node in the graph.

If we have an encrypted input that is always between 0 and 10, we should assign the type EncryptedScalar<uint4> to the node of this input. This is the minimal encrypted integer type that supports all values between 0 and 10.

If there were negative values in the range, we could have used intX instead of uintX.

Bounds measurement is necessary because FHE supports limited precision, and we don't want unexpected behaviour while evaluating the compiled functions.

Let's take a closer look at how we perform bounds measurement.

Inputset evaluation

This is a simple approach that requires an inputset to be provided by the user.

The inputset is not to be confused with the dataset, which is classical in ML, as it doesn't require labels. Rather, the inputset is a set of values which are typical inputs of the function.

The idea is to evaluate each input in the inputset and record the result of each operation in the computation graph. Then we compare the evaluation results with the current minimum/maximum values of each node and update the minimum/maximum accordingly. After the entire inputset is evaluated, we assign a data type to each node using the minimum and maximum values it contains.

Here is an example, given this computation graph where x is encrypted:

and this inputset:

[2, 3, 1]

Evaluation result of 2:

  • x: 2

  • 2: 2

  • *: 4

  • 3: 3

  • +: 7

New bounds:

  • x: [2, 2]

  • 2: [2, 2]

  • *: [4, 4]

  • 3: [3, 3]

  • +: [7, 7]

Evaluation result of 3:

  • x: 3

  • 2: 2

  • *: 6

  • 3: 3

  • +: 9

New bounds:

  • x: [2, 3]

  • 2: [2, 2]

  • *: [4, 6]

  • 3: [3, 3]

  • +: [7, 9]

Evaluation result of 1:

  • x: 1

  • 2: 2

  • *: 2

  • 3: 3

  • +: 5

New bounds:

  • x: [1, 3]

  • 2: [2, 2]

  • *: [2, 6]

  • 3: [3, 3]

  • +: [5, 9]

Assigned data types:

  • x: EncryptedScalar<uint2>

  • 2: ClearScalar<uint2>

  • *: EncryptedScalar<uint3>

  • 3: ClearScalar<uint2>

  • +: EncryptedScalar<uint4>
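The walkthrough above can be reproduced with a short plain-Python sketch. The `evaluate`, `measure_bounds` and `minimal_dtype` helpers are hypothetical and hard-code the graph of (2 * x) + 3; they only illustrate the bookkeeping, not the implementation in Concrete.

```python
def evaluate(x):
    """Evaluate every node of the graph of (2 * x) + 3 for one sample."""
    nodes = {"x": x, "2": 2}
    nodes["*"] = nodes["2"] * nodes["x"]
    nodes["3"] = 3
    nodes["+"] = nodes["*"] + nodes["3"]
    return nodes


def measure_bounds(inputset):
    """Track the minimum and maximum value seen at each node."""
    bounds = {}
    for sample in inputset:
        for name, value in evaluate(sample).items():
            low, high = bounds.get(name, (value, value))
            bounds[name] = (min(low, value), max(high, value))
    return bounds


def minimal_dtype(low, high):
    """Smallest uintX (or intX if negative values occur) covering [low, high]."""
    if low >= 0:
        return f"uint{max(high.bit_length(), 1)}"
    bits = max((-low - 1).bit_length(), max(high, 0).bit_length()) + 1
    return f"int{bits}"


bounds = measure_bounds([2, 3, 1])
dtypes = {name: minimal_dtype(low, high) for name, (low, high) in bounds.items()}

for name in bounds:
    print(f"{name}: {list(bounds[name])} -> {dtypes[name]}")
```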

MLIR Compiler Passes

We describe below some of the main passes in the compilation pipeline.

FHE to TFHE

TFHE Parameterization

TFHE Parameterization takes care of introducing the chosen parameters in the Intermediate Representation (IR). After this pass, you should be able to see the dimension of ciphertexts, as well as other parameters in the IR.

TFHE to Concrete

This pass lowers TFHE operations to low level operations that are closer to the backend implementation, working on tensors and memory buffers (after a bufferization pass).

Concrete to LLVM

This pass lowers everything to LLVM-IR in order to generate the final binary.

See all tutorials

Start here

Go further

Code examples on GitHub

Blog tutorials

Video tutorials


Rounding

This document details the concept of rounding and how it is used in Concrete to speed up some FHE computations.

Table lookups have a strict constraint on the number of bits they support. This can be limiting, especially if you don't need exact precision. In addition, larger bit-widths lead to slower table lookups.

To overcome these issues, rounded table lookups are introduced. This operation provides a way to round the least significant bits of a large integer and then apply the table lookup on the resulting (smaller) value.

Imagine you have a 5-bit value, but you want to have a 3-bit table lookup. You can call fhe.round_bit_pattern(input, lsbs_to_remove=2) and use the 3-bit value you receive as input to the table lookup.

Let's see how rounding works in practice:

prints:

and displays:

If the rounded number is one of the last 2**(lsbs_to_remove - 1) numbers in the input range [0, 2**original_bit_width), an overflow will happen.

By default, if an overflow is encountered during inputset evaluation, bit-widths will be adjusted accordingly. This results in a loss of speed, but ensures accuracy.

You can turn this overflow protection off (e.g., for performance) by using fhe.round_bit_pattern(..., overflow_protection=False). However, this could lead to unexpected behavior at runtime.
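The overflow condition can be checked with a plain-Python model of the rounding arithmetic. The `round_model` helper is hypothetical and assumes that rounding goes to the nearest multiple of 2**lsbs_to_remove with ties going up; it does not call Concrete.

```python
def round_model(x: int, lsbs_to_remove: int) -> int:
    """Round to the nearest multiple of 2**lsbs_to_remove (lsbs_to_remove >= 1)."""
    half = 1 << (lsbs_to_remove - 1)
    return ((x + half) >> lsbs_to_remove) << lsbs_to_remove


original_bit_width = 5
lsbs_to_remove = 2
limit = 2 ** original_bit_width

# Inputs whose rounded value no longer fits in the original bit width
overflowing = [x for x in range(limit) if round_model(x, lsbs_to_remove) >= limit]
print(overflowing)
```

For a 5-bit input with lsbs_to_remove=2, only the last 2**(lsbs_to_remove - 1) values of the range round up to 2**5, which needs a sixth bit. This is the extra bit that overflow protection accounts for.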

Now, let's see how rounding can be used in FHE.

prints:

These speed-ups can vary from system to system.

The speed-up does not keep increasing with lsbs_to_remove because the rounding operation itself has a cost: each bit removal is a PBS. Therefore, if a lot of bits are removed, rounding itself could take longer than the bigger TLU which is evaluated afterwards.

and displays:

Feel free to disable overflow protection and see what happens.

Auto Rounders

Rounding is very useful but, in some cases, you don't know how many bits your input contains, so it's not reliable to specify lsbs_to_remove manually. For this reason, the AutoRounder class is introduced.

AutoRounder allows you to set how many of the most significant bits to keep, but they need to be adjusted using an inputset to determine how many of the least significant bits to remove. This can be done manually using fhe.AutoRounder.adjust(function, inputset), or by setting auto_adjust_rounders configuration to True during compilation.
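The adjustment step can be pictured as follows. This is a sketch under the assumption that the number of bits to remove is the observed input bit width minus the number of most significant bits to keep; the `adjusted_lsbs_to_remove` helper is hypothetical, handles only unsigned samples, and is not the AutoRounder implementation.

```python
def adjusted_lsbs_to_remove(samples, target_msbs: int) -> int:
    """Derive lsbs_to_remove from the widest value observed in the samples."""
    input_bit_width = max(max(samples).bit_length(), 1)
    # Nothing is removed when the input already fits in target_msbs bits.
    return max(input_bit_width - target_msbs, 0)


# Values up to 999 need 10 bits, so keeping 6 MSBs removes 4 LSBs.
print(adjusted_lsbs_to_remove(range(1000), target_msbs=6))
```

This is why the adjustment needs an inputset: the bit width of the input is only known once representative values have been evaluated.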

Here is how auto rounders can be used in FHE:

prints:

and displays:

AutoRounders should be defined outside the function that is being compiled. They are used to store the result of the adjustment process, so they shouldn't be created each time the function is called. Furthermore, each AutoRounder should be used with exactly one round_bit_pattern call.

Exactness

One use of rounding is to speed up computation by ignoring the least significant bits. For this usage, you can get even faster results if you accept the rounding itself being slightly inexact. The speedup is usually around 2x-3x but can be higher for large precision reductions. This also enables higher-precision values that are not possible otherwise.

You can turn on this mode either globally on the configuration:

or on/off locally:

In approximate mode, the threshold for rounding up or down is not perfectly centered. The off-centering:

  • is bounded, i.e. at worst an off-by-one on the reduced precision value compared to the exact result,

  • is pseudo-random, i.e. it will be different on each call,

  • is almost symmetrically distributed,

  • depends on cryptographic properties like the encryption mask, the encryption noise and the crypto-parameters.
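The bounded off-by-one property can be illustrated with a toy model. This is an assumption-laden sketch and not how Concrete computes approximate rounding: `exact_round` and `approximate_round` are hypothetical helpers, and the uniform random offset is an arbitrary stand-in for the real, cryptography-dependent off-centering.

```python
import random


def exact_round(value, lsbs_to_remove):
    """Reduced-precision value with a perfectly centered threshold."""
    half = 1 << (lsbs_to_remove - 1)
    return (value + half) >> lsbs_to_remove


def approximate_round(value, lsbs_to_remove, rng):
    """Same rounding, but with a randomly shifted threshold.

    The shift is strictly smaller than half a step, so the result can differ
    from the exact one by at most one unit of the reduced precision.
    """
    half = 1 << (lsbs_to_remove - 1)
    offset = rng.randint(-(half - 1), half - 1)
    return (value + half + offset) >> lsbs_to_remove


rng = random.Random(0)
worst = max(
    abs(approximate_round(value, 3, rng) - exact_round(value, 3))
    for value in range(2**8)
)
print(worst)  # never more than 1 in this model
```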

Approximate rounding features

With approximate rounding, you can enable approximate clipping to further improve performance when handling overflow. Approximate clipping makes it possible to discard the extra overflow-protection bit in the successor TLU. For consistency, logical clipping is available when this optimization is not suitable.

Logical clipping

When fast approximate clipping is not suitable (i.e. slower), it's better to apply logical clipping for consistency and better resilience to code changes. It has no extra cost since it's fused with the successor TLU.

Approximate clipping

This setting defines the first precision at which approximate clipping is enabled. Starting from this precision, an extra small-precision TLU is introduced to safely remove the extra precision bit used to contain overflow, which makes the successor TLU faster. For example, a rounding to 7 bits ends in an 8-bit TLU because of overflow; forcing a 7-bit TLU instead is 3x faster.

This document details the management of Table Lookups (TLU) within Concrete for advanced usage. For a simpler guide, refer to the Table Lookup Basics.

We refer users to this page for explanations of the fhe.univariate(function) and fhe.multivariate(function) features, which are convenient ways to use automatically created table lookups.

Feel free to play with these configuration options to pick the one best suited for your needs! See How to Configure to learn how you can set a custom p_error and/or global_p_error.

PBSs are very expensive in terms of computation. Fortunately, it is sometimes possible to replace a PBS with a rounded PBS, a truncate PBS, or even an approximate PBS. These TLUs have slightly different semantics, but are very useful in cases like machine learning, where they improve efficiency without a drop in accuracy.

The first step is to define the TFHE-rs ciphertext type that will be used in the computation (see Overview). This includes specifying both cryptographic and encoding parameters. TFHE-rs provides a pre-computed list of recommended parameters, which we will use to avoid manual selection. You can find the parameters used in this guide here.

Next, we generate client and server keys in TFHE-rs using the shared secret key from Concrete. We will cover serialization in a later section, so there's no need to worry about how we loaded the secret key. For now, we will assume we have 4 functions (save_lwe_sk, save_fheuint8, load_lwe_sk, load_fheuint8) which respectively save/load an LWE secret key and an FheUint8 to/from a given path.

First, we do encryption and a simple addition in TFHE-rs. For more information on how to save ciphertexts, refer to Serialization.

A full working example can be found here.

There are two main entry points to the Concrete Compiler. The first is to use the Concrete Python frontend. The second is to use the Compiler directly, which takes MLIR as input. Concrete Python is more high level and uses the Compiler under the hood.

Compilation begins in the frontend with tracing to get an easy-to-manipulate representation of the function. We call this representation a Computation Graph, which is a Directed Acyclic Graph (DAG) containing nodes representing computations done in the function. Working with graphs is useful because they have been studied extensively and there are a lot of available algorithms to manipulate them. Internally, we use networkx, which is an excellent graph library for Python.

The Compiler takes MLIR code that makes use of both the FHE and FHELinalg dialects for scalar and tensor operations respectively.

Compilation then ends with a series of passes that generate a native binary which contains executable code. Crypto parameters are generated along the way as well.

We have allocated a whole new chapter to explaining fusing. You can find it here.

This pass converts high level operations which are not crypto specific to lower level operations from the TFHE scheme. Ciphertexts get introduced in the code as well. TFHE operations and ciphertexts require some parameters which need to be chosen, and the TFHE Parameterization pass does just that.

- November 2023

- March 2023

- May 2024

- May 2024

- February 2024

- October 2023

- October 2023

- July 2023

Table Lookup Basics
this page
How to Configure
rounded PBS
truncate PBS
approximate PBS
Overview
here
later section
Serialization
here
MLIR
networkx
dialects
here
Part I - Concrete, Zama's Fully Homomorphic Encryption Compiler
Part II - The Architecture of Concrete, Zama's Fully Homomorphic Encryption Compiler Leveraging MLIR
Floating points
Key value database
SHA-256
Game of Life
XOR distance
SHA1 with Modules
Levenshtein distance with Modules
Inventory Matching System
Private Information Retrieval
TFHE-rs Compatibility
The Encrypted Game of Life in Python Using Concrete
Encrypted Key-value Database Using Homomorphic Encryption
Compute an XOR distance in FHE using Concrete
Speed up neural networks with approximate rounding using Concrete
Compile composable functions with Concrete
How to use dynamic table look-ups using Concrete
Dive into Concrete - Zama's Fully Homomorphic Encryption Compiler
How To Get Started With Concrete - Zama's Fully Homomorphic Encryption Compiler
Click here
passes
TFHE Parameterization
import matplotlib.pyplot as plt
import numpy as np
from concrete import fhe

original_bit_width = 5
lsbs_to_remove = 2

assert 0 < lsbs_to_remove < original_bit_width

original_values = list(range(2**original_bit_width))
rounded_values = [
    fhe.round_bit_pattern(value, lsbs_to_remove)
    for value in original_values
]

previous_rounded = rounded_values[0]
for original, rounded in zip(original_values, rounded_values):
    if rounded != previous_rounded:
        previous_rounded = rounded
        print()

    original_binary = np.binary_repr(original, width=(original_bit_width + 1))
    rounded_binary = np.binary_repr(rounded, width=(original_bit_width + 1))

    print(
        f"{original:2} = 0b_{original_binary[:-lsbs_to_remove]}[{original_binary[-lsbs_to_remove:]}] "
        f"=> "
        f"0b_{rounded_binary[:-lsbs_to_remove]}[{rounded_binary[-lsbs_to_remove:]}] = {rounded}"
    )

fig = plt.figure()
ax = fig.add_subplot()

plt.plot(original_values, original_values, label="original", color="black")
plt.plot(original_values, rounded_values, label="rounded", color="green")
plt.legend()

ax.set_aspect("equal", adjustable="box")
plt.show()
 0 = 0b_0000[00] => 0b_0000[00] = 0
 1 = 0b_0000[01] => 0b_0000[00] = 0

 2 = 0b_0000[10] => 0b_0001[00] = 4
 3 = 0b_0000[11] => 0b_0001[00] = 4
 4 = 0b_0001[00] => 0b_0001[00] = 4
 5 = 0b_0001[01] => 0b_0001[00] = 4

 6 = 0b_0001[10] => 0b_0010[00] = 8
 7 = 0b_0001[11] => 0b_0010[00] = 8
 8 = 0b_0010[00] => 0b_0010[00] = 8
 9 = 0b_0010[01] => 0b_0010[00] = 8

10 = 0b_0010[10] => 0b_0011[00] = 12
11 = 0b_0010[11] => 0b_0011[00] = 12
12 = 0b_0011[00] => 0b_0011[00] = 12
13 = 0b_0011[01] => 0b_0011[00] = 12

14 = 0b_0011[10] => 0b_0100[00] = 16
15 = 0b_0011[11] => 0b_0100[00] = 16
16 = 0b_0100[00] => 0b_0100[00] = 16
17 = 0b_0100[01] => 0b_0100[00] = 16

18 = 0b_0100[10] => 0b_0101[00] = 20
19 = 0b_0100[11] => 0b_0101[00] = 20
20 = 0b_0101[00] => 0b_0101[00] = 20
21 = 0b_0101[01] => 0b_0101[00] = 20

22 = 0b_0101[10] => 0b_0110[00] = 24
23 = 0b_0101[11] => 0b_0110[00] = 24
24 = 0b_0110[00] => 0b_0110[00] = 24
25 = 0b_0110[01] => 0b_0110[00] = 24

26 = 0b_0110[10] => 0b_0111[00] = 28
27 = 0b_0110[11] => 0b_0111[00] = 28
28 = 0b_0111[00] => 0b_0111[00] = 28
29 = 0b_0111[01] => 0b_0111[00] = 28

30 = 0b_0111[10] => 0b_1000[00] = 32
31 = 0b_0111[11] => 0b_1000[00] = 32
import itertools
import time

import matplotlib.pyplot as plt
import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    enable_unsafe_features=True,
    use_insecure_key_cache=True,
    insecure_key_cache_location=".keys",
    single_precision=False,
    parameter_selection_strategy=fhe.ParameterSelectionStrategy.MULTI,
)

input_bit_width = 6
input_range = np.array(range(2**input_bit_width))

timings = {}
results = {}

for lsbs_to_remove in range(input_bit_width):
    @fhe.compiler({"x": "encrypted"})
    def f(x):
        return fhe.round_bit_pattern(x, lsbs_to_remove) ** 2
    
    circuit = f.compile(inputset=[input_range], configuration=configuration)
    circuit.keygen()
    
    encrypted_sample = circuit.encrypt(input_range)
    start = time.time()
    encrypted_result = circuit.run(encrypted_sample)
    end = time.time()
    result = circuit.decrypt(encrypted_result)
    
    took = end - start
    
    timings[lsbs_to_remove] = took
    results[lsbs_to_remove] = result

number_of_figures = len(results)

columns = 1
for i in range(2, number_of_figures):
    if number_of_figures % i == 0:
        columns = i
rows = number_of_figures // columns

fig, axs = plt.subplots(rows, columns)
axs = axs.flatten()

baseline = timings[0]
for lsbs_to_remove in range(input_bit_width):
    timing = timings[lsbs_to_remove]
    speedup = baseline / timing
    print(f"lsbs_to_remove={lsbs_to_remove} => {speedup:.2f}x speedup")

    axs[lsbs_to_remove].set_title(f"lsbs_to_remove={lsbs_to_remove}")
    axs[lsbs_to_remove].plot(input_range, results[lsbs_to_remove])

plt.show()
lsbs_to_remove=0 => 1.00x speedup
lsbs_to_remove=1 => 1.20x speedup
lsbs_to_remove=2 => 2.17x speedup
lsbs_to_remove=3 => 3.75x speedup
lsbs_to_remove=4 => 2.64x speedup
lsbs_to_remove=5 => 2.61x speedup
import itertools
import time

import matplotlib.pyplot as plt
import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    enable_unsafe_features=True,
    use_insecure_key_cache=True,
    insecure_key_cache_location=".keys",
    single_precision=False,
    parameter_selection_strategy=fhe.ParameterSelectionStrategy.MULTI,
)

input_bit_width = 6
input_range = np.array(range(2**input_bit_width))

timings = {}
results = {}

for target_msbs in reversed(range(1, input_bit_width + 1)):
    rounder = fhe.AutoRounder(target_msbs)

    @fhe.compiler({"x": "encrypted"})
    def f(x):
        return fhe.round_bit_pattern(x, rounder) ** 2

    fhe.AutoRounder.adjust(f, inputset=[input_range])

    circuit = f.compile(inputset=[input_range], configuration=configuration)
    circuit.keygen()

    encrypted_sample = circuit.encrypt(input_range)
    start = time.time()
    encrypted_result = circuit.run(encrypted_sample)
    end = time.time()
    result = circuit.decrypt(encrypted_result)

    took = end - start

    timings[target_msbs] = took
    results[target_msbs] = result

number_of_figures = len(results)

columns = 1
for i in range(2, number_of_figures):
    if number_of_figures % i == 0:
        columns = i
rows = number_of_figures // columns

fig, axs = plt.subplots(rows, columns)
axs = axs.flatten()

baseline = timings[input_bit_width]
for i, target_msbs in enumerate(reversed(range(1, input_bit_width + 1))):
    timing = timings[target_msbs]
    speedup = baseline / timing
    print(f"target_msbs={target_msbs} => {speedup:.2f}x speedup")

    axs[i].set_title(f"target_msbs={target_msbs}")
    axs[i].plot(input_range, results[target_msbs])

plt.show()
target_msbs=6 => 1.00x speedup
target_msbs=5 => 1.22x speedup
target_msbs=4 => 1.95x speedup
target_msbs=3 => 3.11x speedup
target_msbs=2 => 2.23x speedup
target_msbs=1 => 2.34x speedup
configuration = fhe.Configuration(
    ...
    rounding_exactness=fhe.Exactness.APPROXIMATE
)
v = fhe.round_bit_pattern(v, lsbs_to_remove=2, exactness=fhe.Exactness.APPROXIMATE)
v = fhe.round_bit_pattern(v, lsbs_to_remove=2, exactness=fhe.Exactness.EXACT)

What is Concrete

Understand the basic concepts of the Concrete library.

Installation

Follow the step by step guide to install Concrete in your project

Quick start

See a full example of using Concrete to compute on encrypted data

Comparisons

This document describes how comparisons are managed in Concrete, typically 'equal', 'greater than', and so on. It covers different strategies to make the FHE computations faster, depending on the context.

Comparisons are not native operations in Concrete, so they need to be implemented using existing native operations (i.e., additions, clear multiplications, negations, table lookups). Concrete offers three different implementations for performing comparisons.

Chunked

This is the most general implementation that can be used in any situation. The idea is:

# (example below is for bit-width of 8 and chunk size of 4)

# extract chunks of lhs using table lookups
lhs_chunks = [lhs.bits[0:4], lhs.bits[4:8]]

# extract chunks of rhs using table lookups
rhs_chunks = [rhs.bits[0:4], rhs.bits[4:8]]

# pack chunks of lhs and rhs using clear multiplications and additions 
packed_chunks = []
for lhs_chunk, rhs_chunk in zip(lhs_chunks, rhs_chunks):
    shifted_lhs_chunk = lhs_chunk * 2**4  # (i.e., lhs_chunk << 4)
    packed_chunks.append(shifted_lhs_chunk + rhs_chunk)

# apply comparison table lookup to packed chunks
comparison_table = fhe.LookupTable([...])
chunk_comparisons = comparison_table[packed_chunks]

# reduce chunk comparisons to comparison of numbers
result = chunk_comparisons[0]
for chunk_comparison in chunk_comparisons[1:]:
    chunk_reduction_table = fhe.LookupTable([...])
    shifted_chunk_comparison = chunk_comparison * 2**2  # (i.e., chunk_comparison << 2)
    result = chunk_reduction_table[result + shifted_chunk_comparison]
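
The outline above leaves the tables elided. The following plain-Python sketch fills them in so the idea can be run on clear integers. It is an illustration under assumptions rather than what the compiler emits: the helper names are hypothetical, the 0/1/2 encoding (equal, less, greater) is taken from the comparison table visible in the MLIR example of this section, and the final `== LESS` test stands in for whatever the last table lookup would produce in FHE.

```python
EQUAL, LESS, GREATER = 0, 1, 2


def split_into_chunks(value, bit_width, chunk_size):
    """Return the chunks of `value`, most significant chunk first."""
    mask = (1 << chunk_size) - 1
    offsets = reversed(range(0, bit_width, chunk_size))
    return [(value >> offset) & mask for offset in offsets]


def build_comparison_table(chunk_size):
    """Table indexed by (lhs_chunk << chunk_size) + rhs_chunk."""
    mask = (1 << chunk_size) - 1
    table = []
    for packed in range(1 << (2 * chunk_size)):
        lhs_chunk, rhs_chunk = packed >> chunk_size, packed & mask
        if lhs_chunk == rhs_chunk:
            table.append(EQUAL)
        elif lhs_chunk < rhs_chunk:
            table.append(LESS)
        else:
            table.append(GREATER)
    return table


def build_reduction_table():
    """Table indexed by higher_result + (lower_result << 2)."""
    table = [EQUAL] * 16
    for lower in (EQUAL, LESS, GREATER):
        for higher in (EQUAL, LESS, GREATER):
            # The more significant chunks decide, unless they are equal.
            table[higher + (lower << 2)] = higher if higher != EQUAL else lower
    return table


def chunked_less_than(lhs, rhs, bit_width=8, chunk_size=4):
    comparison_table = build_comparison_table(chunk_size)
    reduction_table = build_reduction_table()

    lhs_chunks = split_into_chunks(lhs, bit_width, chunk_size)
    rhs_chunks = split_into_chunks(rhs, bit_width, chunk_size)

    # Pack each pair of chunks and compare them with a single lookup.
    chunk_comparisons = [
        comparison_table[(lhs_chunk << chunk_size) + rhs_chunk]
        for lhs_chunk, rhs_chunk in zip(lhs_chunks, rhs_chunks)
    ]

    # Fold the per-chunk results, starting from the most significant chunk.
    result = chunk_comparisons[0]
    for chunk_comparison in chunk_comparisons[1:]:
        result = reduction_table[result + (chunk_comparison << 2)]
    return result == LESS


print(chunked_less_than(0b0011_1111, 0b0100_0000))  # True
```

Starting the reduction from the most significant chunk keeps the reduction table small: a later chunk only matters when everything before it compared equal.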

Notes

  • Signed comparisons are more complex to explain, but they are supported!

  • The optimal chunk size is selected automatically to reduce the number of table lookups.

  • Chunked comparisons result in at least 5 and at most 13 table lookups.

  • It is used if no other implementation can be used.

  • == and != use a different chunk comparison and reduction strategy with fewer table lookups.

Pros

  • Can be used with any integers.

Cons

  • Very expensive.

Example

import numpy as np
from concrete import fhe

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, show_mlir=True)

produces

module {
  func.func @main(%arg0: !FHE.eint<4>, %arg1: !FHE.eint<4>) -> !FHE.eint<1> {
  
    // extracting the first chunk of x, adjusted for shifting
    %cst = arith.constant dense<[0, 0, 0, 0, 4, 4, 4, 4, 8, 8, 8, 8, 12, 12, 12, 12]> : tensor<16xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    
    // extracting the first chunk of y
    %cst_0 = arith.constant dense<[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]> : tensor<16xi64>
    %1 = "FHE.apply_lookup_table"(%arg1, %cst_0) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    
    // packing first chunks
    %2 = "FHE.add_eint"(%0, %1) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    
    // comparing first chunks
    %cst_1 = arith.constant dense<[0, 1, 1, 1, 2, 0, 1, 1, 2, 2, 0, 1, 2, 2, 2, 0]> : tensor<16xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_1) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    
    // extracting the second chunk of x, adjusted for shifting
    %cst_2 = arith.constant dense<[0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12]> : tensor<16xi64>
    %4 = "FHE.apply_lookup_table"(%arg0, %cst_2) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    
    // extracting the second chunk of y
    %cst_3 = arith.constant dense<[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]> : tensor<16xi64>
    %5 = "FHE.apply_lookup_table"(%arg1, %cst_3) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    
    // packing second chunks
    %6 = "FHE.add_eint"(%4, %5) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    
    // comparing second chunks
    %cst_4 = arith.constant dense<[0, 4, 4, 4, 8, 0, 4, 4, 8, 8, 0, 4, 8, 8, 8, 0]> : tensor<16xi64>
    %7 = "FHE.apply_lookup_table"(%6, %cst_4) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    
    // packing comparisons
    %8 = "FHE.add_eint"(%7, %3) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    
    // reducing comparisons to result
    %cst_5 = arith.constant dense<[0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]> : tensor<16xi64>
    %9 = "FHE.apply_lookup_table"(%8, %cst_5) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<1>
    
    return %9 : !FHE.eint<1>
    
  }
}

Subtraction Trick

This implementation uses the fact that x [<,<=,==,!=,>=,>] y is equal to x - y [<,<=,==,!=,>=,>] 0, which is just a subtraction and a table lookup!
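
To make the identity concrete, here is a small plain-Python model that builds the sign table for each of the six operators and indexes it with the two's complement encoding of x - y. The helper names are hypothetical and the code runs on clear integers; it is meant as a sketch of the idea, not of the compiler's output.

```python
import operator

OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}


def build_sign_table(op, bit_width):
    """Table indexed by the two's complement encoding of x - y on `bit_width` bits."""
    size = 1 << bit_width
    table = []
    for index in range(size):
        # The upper half of the indices encodes negative differences.
        difference = index - size if index >= size // 2 else index
        table.append(int(OPERATORS[op](difference, 0)))
    return table


def compare_by_subtraction(x, y, op, bit_width):
    """One subtraction followed by one table lookup."""
    table = build_sign_table(op, bit_width)
    return table[(x - y) % (1 << bit_width)]


# Two 4-bit operands need 5 bits to hold x - y.
print(build_sign_table("<", 5) == [0] * 16 + [1] * 16)  # True
print(compare_by_subtraction(3, 9, "<", bit_width=5))  # 1
```

For `<`, the table is all zeros for non-negative differences and all ones for negative ones, which is the shape of the 32-entry table in the ONE_TLU_PROMOTED MLIR example.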

There are two major problems with this implementation:

  1. subtraction before the TLU requires up to 2 additional bits to avoid overflows (it is 1 in most cases).

  2. subtraction requires the same bit-width across operands.

What this means is if we are comparing uint3 and uint6, we need to convert both of them to uint7 in some way to do the subtraction and proceed with the TLU in 7-bits. There are 4 ways to achieve this behavior.
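
The 7-bit figure follows from the range of x - y. The sketch below computes it for unsigned operands; `bit_width_of_range` and `subtraction_bit_width` are hypothetical helpers written for this illustration, and the result counts the sign bit that the difference needs.

```python
def bit_width_of_range(lowest, highest):
    """Smallest bit-width able to store every integer in [lowest, highest]."""
    if lowest >= 0:
        return max(highest.bit_length(), 1)
    # Signed range: magnitude bits plus one sign bit.
    return max((-lowest - 1).bit_length(), max(highest, 0).bit_length()) + 1


def subtraction_bit_width(x_bits, y_bits):
    """Bit-width needed for x - y when x and y are unsigned."""
    x_max = (1 << x_bits) - 1
    y_max = (1 << y_bits) - 1
    return bit_width_of_range(-y_max, x_max)


print(subtraction_bit_width(3, 6))  # 7: x - y lies in [-63, 7]
print(subtraction_bit_width(4, 4))  # 5: x - y lies in [-15, 15]
```

The second call matches the promotion to 5 bits seen in the MLIR of the 4-bit examples.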

Requirements

  • (x - y).bit_width <= MAXIMUM_TLU_BIT_WIDTH

1. fhe.ComparisonStrategy.ONE_TLU_PROMOTED

This strategy makes sure that during bit-width assignment, both operands are assigned the same bit-width, and that bit-width contains at least the number of bits required to store x - y. The idea is:

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_promoted_to_uint7 - y_promoted_to_uint7]

Pros

  • It will always result in a single table lookup.

Cons

  • It will increase the bit-width of both operands and lock them to each other across the whole circuit, which can result in significant slowdowns if the operands are used in other costly operations.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    comparison_strategy_preference=fhe.ComparisonStrategy.ONE_TLU_PROMOTED,
)

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  // promotions          ............         ............
  func.func @main(%arg0: !FHE.eint<5>, %arg1: !FHE.eint<5>) -> !FHE.eint<1> {
    
    // subtraction
    %0 = "FHE.to_signed"(%arg0) : (!FHE.eint<5>) -> !FHE.esint<5>
    %1 = "FHE.to_signed"(%arg1) : (!FHE.eint<5>) -> !FHE.esint<5>
    %2 = "FHE.sub_eint"(%0, %1) : (!FHE.esint<5>, !FHE.esint<5>) -> !FHE.esint<5>
    
    // computing the result
    %cst = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]> : tensor<32xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst) : (!FHE.esint<5>, tensor<32xi64>) -> !FHE.eint<1>
    
    return %3 : !FHE.eint<1>
    
  }
  
}

2. fhe.ComparisonStrategy.THREE_TLU_CASTED

This strategy will not put any constraint on bit-widths during bit-width assignment, instead operands are cast to a bit-width that can store x - y during runtime using table lookups. The idea is:

uint3_to_uint7_lut = fhe.LookupTable([...])
x_cast_to_uint7 = uint3_to_uint7_lut[x]

uint6_to_uint7_lut = fhe.LookupTable([...])
y_cast_to_uint7 = uint6_to_uint7_lut[y]

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_cast_to_uint7 - y_cast_to_uint7]
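
A runnable stand-in for the outline above, on clear integers, makes the cost visible. `CountingTable` is a hypothetical replacement for fhe.LookupTable that simply counts how many lookups are applied; the casts are identity tables, as in the MLIR of this section, and the 7-bit width is the one needed for a uint3 x and a uint6 y.

```python
class CountingTable:
    """A list-backed table that counts every lookup applied to any table."""

    applications = 0

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, index):
        CountingTable.applications += 1
        return self.values[index]


BITS = 7  # enough to store x - y for a uint3 x and a uint6 y


def three_tlu_less_than(x, y):
    cast_x = CountingTable(range(2**3))  # uint3 -> 7-bit identity cast
    cast_y = CountingTable(range(2**6))  # uint6 -> 7-bit identity cast
    half = 2 ** (BITS - 1)
    comparison = CountingTable([0] * half + [1] * half)  # 1 when x - y < 0

    difference = cast_x[x] - cast_y[y]
    return comparison[difference % 2**BITS]


print(three_tlu_less_than(3, 40))  # 1
print(CountingTable.applications)  # 3
```

When an operand already has the required bit-width, its cast is unnecessary, which is how the count can drop to two or one as described in the notes below.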

Notes

  • It can result in a single table lookup, if x and y are assigned (because of other operations) the same bit-width and that bit-width can store x - y.

  • Alternatively, two table lookups can be used if only one of the operands is assigned a bit-width bigger than or equal to the bit width that can store x - y.

Pros

  • It will not put any constraints on the bit-widths of the operands, which is amazing if they are used in other costly operations.

  • It will result in at most 3 table lookups, which is still good.

Cons

  • If you are not doing anything else with the operands, or doing less costly operations compared to comparison, it will introduce up to two unnecessary table lookups and slow down execution compared to fhe.ComparisonStrategy.ONE_TLU_PROMOTED.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    comparison_strategy_preference=fhe.ComparisonStrategy.THREE_TLU_CASTED,
)

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // no promotions
  func.func @main(%arg0: !FHE.eint<4>, %arg1: !FHE.eint<4>) -> !FHE.eint<1> {
    
    // casting
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]> : tensor<16xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.esint<5>
    %1 = "FHE.apply_lookup_table"(%arg1, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.esint<5>
    
    // subtraction
    %2 = "FHE.sub_eint"(%0, %1) : (!FHE.esint<5>, !FHE.esint<5>) -> !FHE.esint<5>
    
    // computing the result
    %cst_0 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]> : tensor<32xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_0) : (!FHE.esint<5>, tensor<32xi64>) -> !FHE.eint<1>
    
    return %3 : !FHE.eint<1>
    
  }
  
}

3. fhe.ComparisonStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED

This strategy can be seen as a middle ground between the two strategies described above. With this strategy, only the bigger operand will be constrained to have at least the required bit-width to store x - y, and the smaller operand will be cast to that bit-width during runtime. The idea is:

uint3_to_uint7_lut = fhe.LookupTable([...])
x_cast_to_uint7 = uint3_to_uint7_lut[x]

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_cast_to_uint7 - y_promoted_to_uint7]

Notes

  • It can result in a single table lookup, if the smaller operand is assigned (because of other operations) the same bit-width as the bigger operand.

Pros

  • It will only put a constraint on the bigger operand, which is great if the smaller operand is used in other costly operations.

  • It will result in at most 2 table lookups, which is great.

Cons

  • It will increase the bit-width of the bigger operand, which can result in significant slowdowns if the bigger operand is used in other costly operations.

  • If you are not doing anything else with the smaller operand, or doing less costly operations compared to comparison, it could introduce an unnecessary table lookup and slow down execution compared to fhe.ComparisonStrategy.THREE_TLU_CASTED.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    comparison_strategy_preference=fhe.ComparisonStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED,
)

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**5))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // promotions                               ............
  func.func @main(%arg0: !FHE.eint<3>, %arg1: !FHE.eint<6>) -> !FHE.eint<1> {
    
    // casting the smaller operand
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7]> : tensor<8xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.esint<6>
    
    // subtraction
    %1 = "FHE.to_signed"(%arg1) : (!FHE.eint<6>) -> !FHE.esint<6>
    %2 = "FHE.sub_eint"(%0, %1) : (!FHE.esint<6>, !FHE.esint<6>) -> !FHE.esint<6>
    
    // computing the result
    %cst_0 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]> : tensor<64xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_0) : (!FHE.esint<6>, tensor<64xi64>) -> !FHE.eint<1>
    
    return %3 : !FHE.eint<1>
    
  }
  
}

4. fhe.ComparisonStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED

This strategy can be seen as the exact opposite of the strategy above. With this, only the smaller operand will be constrained to have at least the required bit-width, and the bigger operand will be cast during runtime. The idea is:

uint6_to_uint7_lut = fhe.LookupTable([...])
y_cast_to_uint7 = uint6_to_uint7_lut[y]

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_promoted_to_uint7 - y_cast_to_uint7]

Notes

  • It can result in a single table lookup, if the bigger operand is assigned (because of other operations) the same bit-width as the smaller operand.

Pros

  • It will only put a constraint on the smaller operand, which is great if the bigger operand is used in other costly operations.

  • It will result in at most 2 table lookups, which is great.

Cons

  • It will increase the bit-width of the smaller operand, which can result in significant slowdowns if the smaller operand is used in other costly operations.

  • If you are not doing anything else with the bigger operand, or doing less costly operations compared to comparison, it could introduce an unnecessary table lookup and slow down execution compared to fhe.ComparisonStrategy.THREE_TLU_CASTED.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    comparison_strategy_preference=fhe.ComparisonStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED,
)

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**5))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // promotions          ............
  func.func @main(%arg0: !FHE.eint<6>, %arg1: !FHE.eint<5>) -> !FHE.eint<1> {
    
    // casting the bigger operand
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]> : tensor<32xi64>
    %0 = "FHE.apply_lookup_table"(%arg1, %cst) : (!FHE.eint<5>, tensor<32xi64>) -> !FHE.esint<6>
    
    // subtraction
    %1 = "FHE.to_signed"(%arg0) : (!FHE.eint<6>) -> !FHE.esint<6>
    %2 = "FHE.sub_eint"(%1, %0) : (!FHE.esint<6>, !FHE.esint<6>) -> !FHE.esint<6>
    
    // computing the result
    %cst_0 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]> : tensor<64xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_0) : (!FHE.esint<6>, tensor<64xi64>) -> !FHE.eint<1>
    
    return %3 : !FHE.eint<1>
    
  }
  
}

Clipping Trick

This implementation exploits the fact that the subtraction trick is not optimal in terms of the required intermediate bit-width. The comparison result is the same for compare(3, 40) and compare(3, 4), so why not clip the bigger operand first and then do the subtraction using fewer bits?
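
The claim can be checked exhaustively on clear integers. In this sketch, `clip` and `clipped_less_than` are hypothetical helpers; the clipping bounds follow the requirement stated in this section (one below the smaller operand's minimum, one above its maximum), and x is assumed to be the unsigned, narrower operand.

```python
def clip(value, lowest, highest):
    return max(lowest, min(value, highest))


def clipped_less_than(x, y, x_bits=3):
    """Compare a narrow unsigned x with a wider y after clipping y near x's range."""
    x_min, x_max = 0, (1 << x_bits) - 1
    y_clipped = clip(y, x_min - 1, x_max + 1)

    # x - y_clipped now fits in x_bits + 1 signed bits.
    bits = x_bits + 1
    half = 1 << (bits - 1)
    table = [0] * half + [1] * half  # 1 when the difference is negative
    return table[(x - y_clipped) % (1 << bits)]


print(clipped_less_than(3, 40), clipped_less_than(3, 4))  # 1 1
print(all(clipped_less_than(x, y) == int(x < y) for x in range(8) for y in range(64)))  # True
```

For a uint3 and a uint6, the lookup table has 16 entries instead of the 128 needed by the plain subtraction trick.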

There are two major problems with this implementation:

  1. it cannot be used when the bit-widths are the same (for some cases even when they differ by only one bit)

  2. subtraction still requires the same bit-width across operands.

What this means is that if we are comparing uint3 and uint6, we need to convert both of them to uint4 in some way to do the subtraction and proceed with the TLU in 4 bits. There are 2 ways to achieve this behavior.

Requirements

  • x.bit_width != y.bit_width
  • smaller = x if x.bit_width < y.bit_width else y
    bigger = x if x.bit_width > y.bit_width else y
    clipped = lambda value: np.clip(value, smaller.min() - 1, smaller.max() + 1)
    any(
        (
            bit_width <= MAXIMUM_TLU_BIT_WIDTH and
            bit_width <= bigger.dtype.bit_width and
            bit_width > smaller.dtype.bit_width
        )
        for bit_width in [
            (smaller - clipped(bigger)).bit_width,
            (clipped(bigger) - smaller).bit_width,
        ]
      )
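
The requirement above is written against internal value objects. A self-contained approximation on plain (minimum, maximum) ranges might look like the sketch below. Everything here is an assumption made for illustration: the helper names are hypothetical, bit-widths are derived from the ranges alone, and MAXIMUM_TLU_BIT_WIDTH is set to 16 as a placeholder that should be checked against your Concrete version.

```python
MAXIMUM_TLU_BIT_WIDTH = 16  # assumed limit, not read from the library


def bit_width_of_range(lowest, highest):
    """Smallest bit-width able to store every integer in [lowest, highest]."""
    if lowest >= 0:
        return max(highest.bit_length(), 1)
    return max((-lowest - 1).bit_length(), max(highest, 0).bit_length()) + 1


def clipping_trick_applicable(x_range, y_range):
    """Each range is the (minimum, maximum) pair observed for that operand."""
    x_bits = bit_width_of_range(*x_range)
    y_bits = bit_width_of_range(*y_range)
    if x_bits == y_bits:
        return False

    smaller, bigger = (x_range, y_range) if x_bits < y_bits else (y_range, x_range)
    smaller_bits, bigger_bits = min(x_bits, y_bits), max(x_bits, y_bits)

    def clipped(value):
        return max(smaller[0] - 1, min(value, smaller[1] + 1))

    clipped_bigger = (clipped(bigger[0]), clipped(bigger[1]))
    candidates = [
        # smaller - clipped(bigger)
        bit_width_of_range(smaller[0] - clipped_bigger[1], smaller[1] - clipped_bigger[0]),
        # clipped(bigger) - smaller
        bit_width_of_range(clipped_bigger[0] - smaller[1], clipped_bigger[1] - smaller[0]),
    ]
    return any(
        smaller_bits < bit_width <= min(bigger_bits, MAXIMUM_TLU_BIT_WIDTH)
        for bit_width in candidates
    )


print(clipping_trick_applicable((0, 7), (0, 63)))  # True: uint3 against uint6
print(clipping_trick_applicable((0, 15), (0, 15)))  # False: same bit-width
```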

1. fhe.ComparisonStrategy.THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED

This strategy will not put any constraint on bit-widths during bit-width assignment, instead the smaller operand is cast to a bit-width that can store clipped(bigger) - smaller or smaller - clipped(bigger) during runtime using table lookups. The idea is:

uint3_to_uint4_lut = fhe.LookupTable([...])
x_cast_to_uint4 = uint3_to_uint4_lut[x]

clipper = fhe.LookupTable([...])
y_clipped = clipper[y]

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_cast_to_uint4 - y_clipped]
# or
another_comparison_lut = fhe.LookupTable([...])
result = another_comparison_lut[y_clipped - x_cast_to_uint4]

Notes

  • This is a fallback implementation, so if there is a difference of 1-bit (or in some cases 2-bits) and the subtraction trick cannot be used optimally, this implementation will be used instead of fhe.ComparisonStrategy.CHUNKED.

  • It can result in two table lookups if the smaller operand is assigned a bit-width bigger than or equal to the bit width that can store clipped(bigger) - smaller or smaller - clipped(bigger).

Pros

  • It will not put any constraints on the bit-widths of the operands, which is amazing if they are used in other costly operations.

  • It will result in at most 3 table lookups, which is still good.

  • These table lookups will be on smaller bit-widths, which is great.

Cons

  • Cannot be used to compare integers with the same bit-width, which is very common.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    comparison_strategy_preference=fhe.ComparisonStrategy.THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED
)

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**6))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // no promotions
  func.func @main(%arg0: !FHE.eint<3>, %arg1: !FHE.eint<6>) -> !FHE.eint<1> {
    
    // casting the smaller operand 
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7]> : tensor<8xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.esint<4>
    
    // clipping the bigger operand
    %cst_0 = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]> : tensor<64xi64>
    %1 = "FHE.apply_lookup_table"(%arg1, %cst_0) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.esint<4>
    
    // subtraction
    %2 = "FHE.sub_eint"(%0, %1) : (!FHE.esint<4>, !FHE.esint<4>) -> !FHE.esint<4>
    
    // computing the result
    %cst_1 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]> : tensor<16xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_1) : (!FHE.esint<4>, tensor<16xi64>) -> !FHE.eint<1>
    
    return %3 : !FHE.eint<1>
    
  }
  
}

2. fhe.ComparisonStrategy.TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED

This strategy is similar to the strategy described above. The difference is that with this strategy, the smaller operand will be constrained to have at least the required bit-width to store clipped(bigger) - smaller or smaller - clipped(bigger). The bigger operand will still be clipped to that bit-width during runtime. The idea is:

clipper = fhe.LookupTable([...])
y_clipped = clipper[y]

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_promoted_to_uint4 - y_clipped]
# or
another_comparison_lut = fhe.LookupTable([...])
result = another_comparison_lut[y_clipped - x_promoted_to_uint4]

Pros

  • It will only put a constraint on the smaller operand, which is great if the bigger operand is used in other costly operations.

  • It will result in exactly 2 table lookups, which is great.

Cons

  • It will increase the bit-width of the smaller operand, which can result in significant slowdowns if the smaller operand is used in other costly operations.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    comparison_strategy_preference=fhe.ComparisonStrategy.TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED
)

def f(x, y):
    return x < y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**6))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // promotions          ............
  func.func @main(%arg0: !FHE.eint<4>, %arg1: !FHE.eint<6>) -> !FHE.eint<1> {
    
    // clipping the bigger operand
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]> : tensor<64xi64>
    %0 = "FHE.apply_lookup_table"(%arg1, %cst) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.esint<4>
    
    // subtraction
    %1 = "FHE.to_signed"(%arg0) : (!FHE.eint<4>) -> !FHE.esint<4>
    %2 = "FHE.sub_eint"(%1, %0) : (!FHE.esint<4>, !FHE.esint<4>) -> !FHE.esint<4>
        
    // computing the result
    %cst_0 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]> : tensor<16xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_0) : (!FHE.esint<4>, tensor<16xi64>) -> !FHE.eint<1>
    
    return %3 : !FHE.eint<1>
    
  }
  
}
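The MLIR above can be checked against a plain-Python model. The sketch below is not Concrete code: it rebuilds the two lookup tables from the listing and assumes that the 4-bit signed difference indexes the comparison table in two's complement order (indices 8 to 15 standing for -8 to -1). The names CLIP_TABLE, COMPARISON_TABLE and less_than are illustrative only.

```python
# Plain-Python model of the two table lookups in the MLIR listing above.
# No FHE library is involved; all values are clear integers.

# First lookup: clip the 6-bit operand into the range [0, 8].
CLIP_TABLE = [min(value, 8) for value in range(2**6)]

# Second lookup: map a 4-bit signed difference to the comparison result.
# Assumption: the table is indexed in two's complement order, so indices
# 8..15 correspond to the negative differences -8..-1.
COMPARISON_TABLE = [0] * 8 + [1] * 8


def less_than(x, y):
    """Return 1 if x < y else 0, for x in [0, 8) and y in [0, 64)."""
    y_clipped = CLIP_TABLE[y]
    difference = x - y_clipped  # always within [-8, 7]
    index = difference % 16     # two's complement encoding on 4 bits
    return COMPARISON_TABLE[index]
```

Clipping y to 8 is enough because x never exceeds 7: any y of 8 or more is already known to be greater than x, so the exact value of y no longer matters.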

Summary

Strategy                                   Minimum # of TLUs   Maximum # of TLUs   Can increase the bit-width of the inputs
CHUNKED                                    5                   13
ONE_TLU_PROMOTED                           1                   1                   ✓
THREE_TLU_CASTED                           1                   3
TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED     1                   2                   ✓
TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED     1                   2                   ✓
THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED    2                   3
TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED    2                   2                   ✓

Concrete will choose the best strategy available after bit-width assignment, regardless of the specified preference.

Different strategies are good for different circuits. If you want the best runtime for your use case, you can compile your circuit with all different comparison strategy preferences, and pick the one with the lowest complexity.
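The selection loop described above could be sketched as follows. This is a library-free outline, not Concrete's API: compile_with is a hypothetical callable that compiles the circuit for one strategy preference and returns an object exposing a complexity attribute. The stand-in compile function and its complexity numbers are placeholders so the sketch runs on its own; they are not measurements.

```python
from types import SimpleNamespace


def pick_lowest_complexity(compile_with, strategies):
    """Compile once per strategy and return the (strategy, circuit) pair
    whose circuit reports the lowest complexity."""
    compiled = [(strategy, compile_with(strategy)) for strategy in strategies]
    return min(compiled, key=lambda pair: pair[1].complexity)


# Stand-in for a real compilation step. The numbers are placeholders
# chosen for illustration only.
placeholder_complexities = {
    "CHUNKED": 300.0,
    "ONE_TLU_PROMOTED": 120.0,
    "THREE_TLU_CASTED": 180.0,
}


def fake_compile(strategy):
    return SimpleNamespace(complexity=placeholder_complexities[strategy])


best_strategy, best_circuit = pick_lowest_complexity(
    fake_compile, placeholder_complexities
)
```

With Concrete, compile_with would wrap a call such as compiler.compile(inputset, configuration) where the configuration sets comparison_strategy_preference to the strategy being tried.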

Bit extraction

This document provides an overview of the bit extraction feature in Concrete, including usage examples, limitations, and performance considerations.

Overview

Bit extraction can be useful in applications that need to manipulate the bits of integers directly. It allows you to extract a specific slice of bits from an integer, where index 0 corresponds to the least significant bit (LSB). The cost of this operation increases with the index of the most significant bit you wish to extract.

Bit extraction only works in the Native encoding, which is usually selected when all table lookups in the circuit are less than or equal to 8 bits.

Extracting a specific bit

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.bits(x)[0], fhe.bits(x)[3]

inputset = range(32)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0b_00000) == (0, 0)
assert circuit.encrypt_run_decrypt(0b_00001) == (1, 0)

assert circuit.encrypt_run_decrypt(0b_01100) == (0, 1)
assert circuit.encrypt_run_decrypt(0b_01101) == (1, 1)

Extracting multiple bits with slices

You can use slices for indexing fhe.bits(value):

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.bits(x)[1:4]

inputset = range(32)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0b_01101) == 0b_110
assert circuit.encrypt_run_decrypt(0b_01011) == 0b_101

Bit extraction supports slices with negative steps:

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.bits(x)[3:0:-1]

inputset = range(32)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(0b_01101) == 0b_011
assert circuit.encrypt_run_decrypt(0b_01011) == 0b_101

Bit extraction with signed integers

Bit extraction supports signed integers:

from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.bits(x)[1:3]

inputset = range(-16, 16)
circuit = f.compile(inputset)

assert circuit.encrypt_run_decrypt(-14) == 0b_01  # -14 == 0b_10010 (in two's complement)
assert circuit.encrypt_run_decrypt(-12) == 0b_10  # -12 == 0b_10100 (in two's complement)
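The indexing behavior in the examples above can be modeled on clear integers. The Bits class below is a hypothetical stand-in written for this illustration, not part of Concrete. It assumes that the first extracted bit becomes the least significant bit of the result, which matches the assertions shown for forward and reverse slices, and it relies on Python's arithmetic right shift to behave like two's complement for negative values.

```python
class Bits:
    """Clear-integer model of the bit indexing shown in the examples above."""

    def __init__(self, value):
        self.value = value

    def _bit(self, position):
        # Python's >> on negative integers is an arithmetic shift, which
        # matches two's complement with an unbounded number of sign bits.
        return (self.value >> position) & 1

    def __getitem__(self, index):
        if isinstance(index, int):
            if index < 0:
                raise ValueError("negative indexing is not supported")
            return self._bit(index)

        step = 1 if index.step is None else index.step
        if index.start is None:
            if step < 0:
                raise ValueError("reverse slicing requires an explicit start")
            start = 0
        else:
            start = index.start
        if index.stop is None:
            if step > 0:
                raise ValueError("this model requires an explicit stop")
            stop = -1  # with a negative step, run down to bit 0 inclusive
        else:
            stop = index.stop
        if start < 0 or (index.stop is not None and stop < 0):
            raise ValueError("negative indexing is not supported")

        # The first extracted bit lands in the lowest position of the result.
        positions = range(start, stop, step)
        return sum(self._bit(p) << i for i, p in enumerate(positions))
```

For instance, Bits(0b_01101)[1:4] evaluates to 0b_110 and Bits(-14)[1:3] evaluates to 0b_01, as in the encrypted examples.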

Use case example

Here's a practical example that uses bit extraction to determine if a number is even:

import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def is_even(x):
    return 1 - fhe.bits(x)[0]

inputset = [
    np.random.randint(-16, 16, size=(5,))
    for _ in range(100)
]
circuit = is_even.compile(inputset)

sample = np.random.randint(-16, 16, size=(5,))
for value, value_is_even in zip(sample, circuit.encrypt_run_decrypt(sample)):
    print(f"{value} is {'even' if value_is_even else 'odd'}")

It prints:

13 is odd
0 is even
-15 is odd
2 is even
-6 is even

Limitations

  • Negative indexing is not supported: Bit extraction using negative indices, such as fhe.bits(x)[-1], is not supported.

    • This is because the bit-width of x is unknown before inputset evaluation, making it impossible to determine the correct bit to extract.

  • Reverse slicing requires explicit starting bit: When extracting bits in reverse order (using a negative step), the start bit must be specified, for example, fhe.bits(x)[::-1] is not supported.

  • Signed integer slicing requires explicit stopping bit: For signed integers, when using slices, the stop bit must be explicitly provided, for example, fhe.bits(x)[1:] is not supported.

  • Float bit extraction is not supported: While Concrete supports floats to some extent, bit extraction is not possible on float types.

Performance considerations

A Chain of individual bit extractions

Extracting a specific bit requires clearing all the preceding lower bits. This involves extracting these previous bits as intermediate values and then subtracting them from the input.

Implications:

  • Bits are extracted sequentially, starting from the least significant bit to the more significant ones. The cost is proportional to the index of the highest extracted bit plus one.

  • No parallelization is possible. The computation time is proportional to the cost, independent of the number of CPUs.

Examples:

  • Extracting fhe.bits(x)[4] is approximately five times costlier than extracting fhe.bits(x)[0].

  • Extracting fhe.bits(x)[4] takes around five times more wall clock time than fhe.bits(x)[0].

  • The cost of extracting fhe.bits(x)[0:5] is almost the same as that of fhe.bits(x)[4], because bit 4 is the highest bit in that slice.
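The sequential process described above can be sketched on clear integers. This is a simplified model under stated assumptions, not the compiler's implementation: it handles non-negative values only and counts one lookup per extracted bit, following the cost rule given in this section. The function name extract_bit is illustrative.

```python
def extract_bit(value, index):
    """Return (bit, lookups) for a non-negative integer.

    Bits are extracted one at a time from the least significant bit upward.
    """
    remaining = value
    lookups = 0
    for position in range(index + 1):
        # In FHE this step would be one table lookup on a 1-bit input.
        bit = (remaining >> position) & 1
        lookups += 1
        if position == index:
            return bit, lookups
        # Subtract the extracted bit so every bit below the next position is zero.
        remaining -= bit << position
```

Since every step depends on the result of the previous one, the loop cannot be parallelized, and the lookup count equals the index of the requested bit plus one.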

Reuse of Intermediate Extracted Bits

Common sub-expression elimination is applied to intermediate extracted bits.

Implications:

  • The overall cost for a series of fhe.bits(x)[m:n] calls on the same input x is almost equivalent to the cost of the single most computationally expensive extraction in the series, i.e. the one that reaches the highest bit index (fhe.bits(x)[n - 1] for the largest n).

  • The order of extraction in that series does not affect the overall cost.

Example:

The combined operation fhe.bits(x)[3] + fhe.bits(x)[2] + fhe.bits(x)[1] has almost the same cost as fhe.bits(x)[3].

TLUs of 1b input precision

Each extracted bit incurs a cost of approximately one TLU of 1-bit input precision. Therefore, fhe.bits(x)[0] is generally faster than any other TLU operation.


Frontend fusing

This document describes the concept of fusing, which is the act of combining multiple nodes into a single node, which is converted to a Table Lookup.

How is it done?

Code related to fusing is in the frontends/concrete-python/concrete/fhe/compilation/utils.py file. Fusing can be performed using the fuse function.

Within fuse:

  1. We loop until there are no more subgraphs to fuse.

  2. Within each iteration:

    2.1. We find a subgraph to fuse.

    2.2. We search for a terminal node that is appropriate for fusing.

    2.3. We crawl backwards to find the closest integer nodes to this node.

    2.4. If there is a single node as such, we return the subgraph from this node to the terminal node.

    2.5. Otherwise, we try to find the lowest common ancestor (lca) of this list of nodes.

    2.6. If an lca doesn't exist, we say this particular terminal node is not fusable, and we go back to search for another subgraph.

    2.7. Otherwise, we use this lca as the input of the subgraph and continue with subgraph node creation below.

    2.8. We convert the subgraph into a subgraph node by checking fusability status of the nodes of the subgraph in this step.

    2.9. We substitute the subgraph node to the original graph.

Limitations

With the current implementation, we cannot fuse subgraphs that depend on multiple encrypted values where those values don't have a common lca (e.g., np.round(np.sin(x) + np.cos(y))).
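The backward crawl from steps 2.3 and 2.4 can be illustrated with a small sketch. It does not reproduce the fuse function from utils.py: the dictionary-based graph, the node names and closest_integer_ancestors are all hypothetical, and the lowest common ancestor search of steps 2.5 to 2.7 is left out.

```python
def closest_integer_ancestors(graph, terminal):
    """Crawl backwards from `terminal` and collect the nearest integer nodes."""
    found = set()
    seen = set()
    stack = list(graph[terminal]["inputs"])
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if graph[node]["dtype"] == "int":
            found.add(node)  # stop crawling once an integer node is reached
        else:
            stack.extend(graph[node]["inputs"])
    return found


# Toy graph for np.around(x + 1.5 + np.sin(x)).astype(np.int64)
single_input = {
    "x": {"dtype": "int", "inputs": []},
    "const": {"dtype": "float", "inputs": []},
    "add1": {"dtype": "float", "inputs": ["x", "const"]},
    "sin": {"dtype": "float", "inputs": ["x"]},
    "add2": {"dtype": "float", "inputs": ["add1", "sin"]},
    "around": {"dtype": "float", "inputs": ["add2"]},
    "astype": {"dtype": "int", "inputs": ["around"]},
}

# Same graph, except that the sine is taken on a second encrypted input y.
two_inputs = {
    **single_input,
    "y": {"dtype": "int", "inputs": []},
    "sin": {"dtype": "float", "inputs": ["y"]},
}
```

In the first graph the crawl finds the single integer node x, so the whole chain can become one table lookup. In the second it finds both x and y, which have no common ancestor, so the terminal node is not fusable.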

*Using the default configuration in approximate mode. For 3, 4, 5 and 6 reduced precision bits and accumulator precision up to 32bits

In blue the exact value, the red dots are approximate values due to off-centered transition in approximate mode.

Histogram of transitions off-centering delta. Each count correspond to a specific random mask and a specific encryption noise.

Only the last step is clipped.

The last steps are decreased.

Floating points

This document describes how floating points are treated and manipulated in Concrete.

Floating points as intermediate values

Concrete Compiler, which is used for compiling the circuit, doesn't support floating points at all. However, it supports table lookups, which take an integer and map it to another integer. The constraint of this operation is that there must be a single integer input and a single integer output.

As long as your floating point operations comply with those constraints, Concrete automatically converts them to a table lookup operation:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    a = x + 1.5
    b = np.sin(x)
    c = np.around(a + b)
    d = c.astype(np.int64)
    return d

inputset = range(8)
circuit = f.compile(inputset)

for x in range(8):
    assert circuit.encrypt_run_decrypt(x) == f(x)

In the example above, a, b, and c are floating point intermediates. They are used to calculate d, which is an integer with a value dependent upon x, which is also an integer. Concrete detects this and fuses all of these operations into a single table lookup from x to d.
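To see why such a chain can collapse into one table lookup, the same function can be tabulated in plain Python. This is an illustration on clear integers using the standard math module instead of NumPy; it is not how Concrete builds the table internally. Both round and np.around round halves to the nearest even value, so they agree on these inputs.

```python
import math


def f(x):
    a = x + 1.5          # float intermediate
    b = math.sin(x)      # float intermediate
    c = round(a + b)     # rounded, half-to-even like np.around
    return int(c)


# One integer output per possible integer input: this list is the table.
table = [f(x) for x in range(8)]
```

The floats never leave the function, so evaluating it on an encrypted x amounts to looking up table[x].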

This approach works for a variety of use cases, but it comes up short for others:

from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    a = x + 1.5
    b = np.sin(y)
    c = np.around(a + b)
    d = c.astype(np.int64)
    return d

inputset = [(1, 2), (3, 0), (2, 2), (1, 3)]
circuit = f.compile(inputset)

for x, y in inputset:
    assert circuit.encrypt_run_decrypt(x, y) == f(x, y)

This results in:

RuntimeError: Function you are trying to compile cannot be converted to MLIR

%0 = x                             # EncryptedScalar<uint2>
%1 = 1.5                           # ClearScalar<float64>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only integer constants are supported
%2 = y                             # EncryptedScalar<uint2>
%3 = add(%0, %1)                   # EncryptedScalar<float64>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only integer operations are supported
%4 = sin(%2)                       # EncryptedScalar<float64>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only integer operations are supported
%5 = add(%3, %4)                   # EncryptedScalar<float64>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only integer operations are supported
%6 = around(%5)                    # EncryptedScalar<float64>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only integer operations are supported
%7 = astype(%6, dtype=int_)        # EncryptedScalar<uint3>
return %7

The reason for the error is that d no longer depends solely on x; it depends on y as well. Concrete cannot fuse these operations, so it raises an exception instead.

Supported operations

This document lists the operations you can use inside the function that you are compiling.

Some operations are not supported between two encrypted values. If attempted, a detailed error message will be raised.

Supported Python operators.

Supported NumPy functions.

Supported ndarray methods.

Supported ndarray properties.

Truncating

This document details the concept of truncating, and how it is used in Concrete to make some FHE computations faster.

Table lookups have a strict constraint on the number of bits they support. This can be limiting, especially if you don't need exact precision. In addition, using larger bit-widths leads to slower table lookups.

To overcome these issues, truncated table lookups are introduced. This operation provides a way to zero the least significant bits of a large integer and then apply the table lookup on the resulting (smaller) value.

Imagine you have a 5-bit value. You can use fhe.truncate_bit_pattern(value, lsbs_to_remove=2) to truncate it (here the last 2 bits are discarded). Once truncated, the value remains on 5 bits (e.g., 22 = 0b10110 would be truncated to 20 = 0b10100), and its last 2 bits are zero. Concrete uses this to optimize table lookups on the truncated value: the 5-bit table lookup gets optimized to a 3-bit table lookup, which is much faster.
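On clear non-negative integers, the effect of truncation can be modeled with two shifts. The helper below is a hypothetical stand-in named truncate_model, written for illustration rather than taken from Concrete.

```python
def truncate_model(value, lsbs_to_remove):
    """Zero the lowest `lsbs_to_remove` bits of a non-negative integer."""
    return (value >> lsbs_to_remove) << lsbs_to_remove


# A 5-bit input truncated by 2 bits takes only 2**3 distinct values,
# which is why a 3-bit table lookup is enough afterwards.
distinct_values = sorted({truncate_model(v, 2) for v in range(2**5)})
```

Here truncate_model(22, 2) gives 20, matching the 0b10110 to 0b10100 example above.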

Let's see how truncation works in practice:

import matplotlib.pyplot as plt
import numpy as np
from concrete import fhe

original_bit_width = 5
lsbs_to_remove = 2

assert 0 < lsbs_to_remove < original_bit_width

original_values = list(range(2**original_bit_width))
truncated_values = [
    fhe.truncate_bit_pattern(value, lsbs_to_remove)
    for value in original_values
]

previous_truncated = truncated_values[0]
for original, truncated in zip(original_values, truncated_values):
    if truncated != previous_truncated:
        previous_truncated = truncated
        print()

    original_binary = np.binary_repr(original, width=(original_bit_width + 1))
    truncated_binary = np.binary_repr(truncated, width=(original_bit_width + 1))

    print(
        f"{original:2} = 0b_{original_binary[:-lsbs_to_remove]}[{original_binary[-lsbs_to_remove:]}] "
        f"=> "
        f"0b_{truncated_binary[:-lsbs_to_remove]}[{truncated_binary[-lsbs_to_remove:]}] = {truncated}"
    )

fig = plt.figure()
ax = fig.add_subplot()

plt.plot(original_values, original_values, label="original", color="black")
plt.plot(original_values, truncated_values, label="truncated", color="green")
plt.legend()

ax.set_aspect("equal", adjustable="box")
plt.show()

prints:

 0 = 0b_0000[00] => 0b_0000[00] = 0
 1 = 0b_0000[01] => 0b_0000[00] = 0
 2 = 0b_0000[10] => 0b_0000[00] = 0
 3 = 0b_0000[11] => 0b_0000[00] = 0

 4 = 0b_0001[00] => 0b_0001[00] = 4
 5 = 0b_0001[01] => 0b_0001[00] = 4
 6 = 0b_0001[10] => 0b_0001[00] = 4
 7 = 0b_0001[11] => 0b_0001[00] = 4

 8 = 0b_0010[00] => 0b_0010[00] = 8
 9 = 0b_0010[01] => 0b_0010[00] = 8
10 = 0b_0010[10] => 0b_0010[00] = 8
11 = 0b_0010[11] => 0b_0010[00] = 8

12 = 0b_0011[00] => 0b_0011[00] = 12
13 = 0b_0011[01] => 0b_0011[00] = 12
14 = 0b_0011[10] => 0b_0011[00] = 12
15 = 0b_0011[11] => 0b_0011[00] = 12

16 = 0b_0100[00] => 0b_0100[00] = 16
17 = 0b_0100[01] => 0b_0100[00] = 16
18 = 0b_0100[10] => 0b_0100[00] = 16
19 = 0b_0100[11] => 0b_0100[00] = 16

20 = 0b_0101[00] => 0b_0101[00] = 20
21 = 0b_0101[01] => 0b_0101[00] = 20
22 = 0b_0101[10] => 0b_0101[00] = 20
23 = 0b_0101[11] => 0b_0101[00] = 20

24 = 0b_0110[00] => 0b_0110[00] = 24
25 = 0b_0110[01] => 0b_0110[00] = 24
26 = 0b_0110[10] => 0b_0110[00] = 24
27 = 0b_0110[11] => 0b_0110[00] = 24

28 = 0b_0111[00] => 0b_0111[00] = 28
29 = 0b_0111[01] => 0b_0111[00] = 28
30 = 0b_0111[10] => 0b_0111[00] = 28
31 = 0b_0111[11] => 0b_0111[00] = 28

and displays a plot of the original values (black) and the truncated values (green).

Now, let's see how truncating can be used in FHE.

import time

import matplotlib.pyplot as plt
import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    enable_unsafe_features=True,
    use_insecure_key_cache=True,
    insecure_key_cache_location=".keys",
)

input_bit_width = 6
input_range = np.array(range(2**input_bit_width))

timings = {}
results = {}

for lsbs_to_remove in range(input_bit_width):
    @fhe.compiler({"x": "encrypted"})
    def f(x):
        return fhe.truncate_bit_pattern(x, lsbs_to_remove) ** 2
    
    circuit = f.compile(inputset=[input_range], configuration=configuration)
    circuit.keygen()
    
    encrypted_sample = circuit.encrypt(input_range)
    start = time.time()
    encrypted_result = circuit.run(encrypted_sample)
    end = time.time()
    result = circuit.decrypt(encrypted_result)
    
    took = end - start
    
    timings[lsbs_to_remove] = took
    results[lsbs_to_remove] = result

number_of_figures = len(results)

columns = 1
for i in range(2, number_of_figures):
    if number_of_figures % i == 0:
        columns = i
rows = number_of_figures // columns

fig, axs = plt.subplots(rows, columns)
axs = axs.flatten()

baseline = timings[0]
for lsbs_to_remove in range(input_bit_width):
    timing = timings[lsbs_to_remove]
    speedup = baseline / timing
    print(f"lsbs_to_remove={lsbs_to_remove} => {speedup:.2f}x speedup")

    axs[lsbs_to_remove].set_title(f"lsbs_to_remove={lsbs_to_remove}")
    axs[lsbs_to_remove].plot(input_range, results[lsbs_to_remove])

plt.show()

prints:

lsbs_to_remove=0 => 1.00x speedup
lsbs_to_remove=1 => 1.69x speedup
lsbs_to_remove=2 => 3.48x speedup
lsbs_to_remove=3 => 3.06x speedup
lsbs_to_remove=4 => 3.46x speedup
lsbs_to_remove=5 => 3.14x speedup

These speed-ups can vary from system to system.

The speed-up does not keep increasing with lsbs_to_remove because the truncating operation itself has a cost: each bit removal is a PBS. Therefore, if many bits are removed, the truncation itself can take longer than the bigger TLU that is evaluated afterwards.

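The script also displays one subplot per lsbs_to_remove value, plotting the decrypted results over the input range.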

Auto Truncators

Truncating is very useful but, in some cases, you don't know how many bits your input contains, so it's not reliable to specify lsbs_to_remove manually. For this reason, the AutoTruncator class is introduced.

AutoTruncator allows you to set how many of the most significant bits to keep, but it must be adjusted using an inputset to determine how many of the least significant bits to remove. This can be done manually using fhe.AutoTruncator.adjust(function, inputset), or by setting the auto_adjust_truncators configuration to True during compilation.
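The adjustment step can be pictured with a small clear-integer sketch. It rests on an assumption made for this illustration: the number of bits to remove is the observed input bit-width minus the number of most significant bits to keep, floored at zero. The function adjusted_lsbs_to_remove is hypothetical, covers unsigned inputs only, and does not reproduce the AutoTruncator class.

```python
def adjusted_lsbs_to_remove(inputset, target_msbs):
    """Estimate how many low bits to drop so that `target_msbs` bits remain.

    Assumes unsigned inputs; the bit-width is taken from the largest sample.
    """
    bit_width = max(int(value).bit_length() for value in inputset)
    return max(bit_width - target_msbs, 0)
```

With a 6-bit inputset such as range(64), keeping 4 bits means removing 2, while asking to keep 6 or more bits removes none.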

Here is how auto truncators can be used in FHE:

import time

import matplotlib.pyplot as plt
import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    enable_unsafe_features=True,
    use_insecure_key_cache=True,
    insecure_key_cache_location=".keys",
    single_precision=False,
    parameter_selection_strategy=fhe.ParameterSelectionStrategy.MULTI,
)

input_bit_width = 6
input_range = np.array(range(2**input_bit_width))

timings = {}
results = {}

for target_msbs in reversed(range(1, input_bit_width + 1)):
    truncator = fhe.AutoTruncator(target_msbs)

    @fhe.compiler({"x": "encrypted"})
    def f(x):
        return fhe.truncate_bit_pattern(x, lsbs_to_remove=truncator) ** 2

    fhe.AutoTruncator.adjust(f, inputset=[input_range])

    circuit = f.compile(inputset=[input_range], configuration=configuration)
    circuit.keygen()

    encrypted_sample = circuit.encrypt(input_range)
    start = time.time()
    encrypted_result = circuit.run(encrypted_sample)
    end = time.time()
    result = circuit.decrypt(encrypted_result)

    took = end - start

    timings[target_msbs] = took
    results[target_msbs] = result

number_of_figures = len(results)

columns = 1
for i in range(2, number_of_figures):
    if number_of_figures % i == 0:
        columns = i
rows = number_of_figures // columns

fig, axs = plt.subplots(rows, columns)
axs = axs.flatten()

baseline = timings[input_bit_width]
for i, target_msbs in enumerate(reversed(range(1, input_bit_width + 1))):
    timing = timings[target_msbs]
    speedup = baseline / timing
    print(f"target_msbs={target_msbs} => {speedup:.2f}x speedup")

    axs[i].set_title(f"target_msbs={target_msbs}")
    axs[i].plot(input_range, results[target_msbs])

plt.show()

prints:

target_msbs=6 => 1.00x speedup
target_msbs=5 => 1.80x speedup
target_msbs=4 => 3.47x speedup
target_msbs=3 => 3.02x speedup
target_msbs=2 => 3.38x speedup
target_msbs=1 => 3.37x speedup

and displays one subplot per target_msbs value, plotting the decrypted results over the input range.

AutoTruncators should be defined outside the function that is being compiled. They are used to store the result of the adjustment process, so they shouldn't be created each time the function is called. Furthermore, each AutoTruncator should be used with exactly one truncate_bit_pattern call.

Direct circuits

This document explains the concept of direct circuits in Concrete, which are another way to compile a circuit without having to provide a proper inputset.

Direct circuits are still experimental. It is very easy to make mistakes (e.g., due to no overflow checks or type coercion) while using direct circuits, so utilize them with care.

For some applications, the data types of inputs, intermediate values, and outputs are known (e.g., for manipulating bytes, you would want to use uint8). Using inputsets to determine bounds in these cases is not necessary, and can even be error-prone. Therefore, another interface for defining such circuits is introduced:

from concrete import fhe

@fhe.circuit({"x": "encrypted"})
def circuit(x: fhe.uint8):
    return x + 42

assert circuit.encrypt_run_decrypt(10) == 52

There are a few differences between direct circuits and traditional circuits:

  • Remember that the resulting dtype for each operation will be determined by its inputs. This can lead to some unexpected results if you're not careful (e.g., if you do -x where x: fhe.uint8, you won't receive a negative value, as the result will be fhe.uint8 as well).

  • There is no inputset evaluation, so the bit width of an output cannot be determined automatically. Use fhe types in .astype(...) calls (e.g., np.sqrt(x).astype(fhe.uint4)) to specify it.

  • Be careful with overflows. With inputset evaluation, you'll get bigger bit widths but no overflows. With direct definition, you must ensure that there aren't any overflows manually.
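Since overflow checks are left to you, one option is to verify value ranges on clear integers before writing the direct circuit. The helper below is a hypothetical sketch for that purpose and is not part of Concrete.

```python
def fits_in(value, bits, signed=False):
    """Return True if `value` is representable on `bits` bits."""
    if signed:
        return -(2 ** (bits - 1)) <= value < 2 ** (bits - 1)
    return 0 <= value < 2 ** bits
```

For example, fits_in(10 + 42, 8) holds, while fits_in(250 + 42, 8) and fits_in(-10, 8) do not, so x + 42 and -x are only safe on fhe.uint8 for a restricted range of x.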

Let's review a more complicated example to see how direct circuits behave:

from concrete import fhe
import numpy as np

def square(value):
    return value ** 2

@fhe.circuit({"x": "encrypted", "y": "encrypted"})
def circuit(x: fhe.uint8, y: fhe.int2):
    a = x + 10
    b = y + 10

    c = np.sqrt(a).round().astype(fhe.uint4)
    d = fhe.univariate(square, outputs=fhe.uint8)(b)

    return d - c

print(circuit)

This prints:

%0 = x                       # EncryptedScalar<uint8>
%1 = y                       # EncryptedScalar<int2>
%2 = 10                      # ClearScalar<uint4>
%3 = add(%0, %2)             # EncryptedScalar<uint8>
%4 = 10                      # ClearScalar<uint4>
%5 = add(%1, %4)             # EncryptedScalar<int4>
%6 = subgraph(%3)            # EncryptedScalar<uint4>
%7 = square(%5)              # EncryptedScalar<uint8>
%8 = subtract(%7, %6)        # EncryptedScalar<uint8>
return %8

Subgraphs:

    %6 = subgraph(%3):

        %0 = input                         # EncryptedScalar<uint8>
        %1 = sqrt(%0)                      # EncryptedScalar<float64>
        %2 = around(%1, decimals=0)        # EncryptedScalar<float64>
        %3 = astype(%2)                    # EncryptedScalar<uint4>
        return %3

Here is the breakdown of the assigned data types:

%0 is uint8 because it's specified in the definition
%1 is  int2 because it's specified in the definition
%2 is uint4 because it's the constant 10
%3 is uint8 because it's the addition between uint8 and uint4
%4 is uint4 because it's the constant 10
%5 is  int4 because it's the addition between int2 and uint4
%6 is uint4 because it's specified in astype
%7 is uint8 because it's specified in univariate
%8 is uint8 because it's subtraction between uint8 and uint4

As you can see, %8 is subtraction of two unsigned values, and the result is unsigned as well. In the case that c > d, we have an overflow, and this results in undefined behavior.

Installation

This document explains the steps to install Concrete into your project.

Concrete is natively supported on Linux and macOS from Python 3.8 to 3.11 inclusive. If you have Docker on your platform, you can also use the Docker image to run Concrete.

Using PyPI

Install Concrete from PyPI using the following commands:

pip install -U pip wheel setuptools
pip install concrete-python

Not all versions are available on PyPI. If you need a version that is not on PyPI (including nightly releases), you can install it from our package index by adding --extra-index-url https://pypi.zama.ai/cpu/. GPU wheels are also available under https://pypi.zama.ai/gpu/ (check https://pypi.zama.ai/ for all available platforms).

To enable all the optional features, install the full version of Concrete:

pip install -U pip wheel setuptools
pip install concrete-python[full]

If pygraphviz fails to install on macOS and you're using Homebrew, you can try installing graphviz first:

brew install graphviz
CFLAGS=-I$(brew --prefix graphviz)/include LDFLAGS=-L$(brew --prefix graphviz)/lib pip --no-cache-dir install pygraphviz

before running:

pip install concrete-python[full]

Using Docker

You can also get the Concrete Docker image. Replace v2.4.0 below with the version you want to install:

docker pull zamafhe/concrete-python:v2.4.0
docker run --rm -it zamafhe/concrete-python:v2.4.0 /bin/bash

Docker is not supported on Apple Silicon.

Cryptography basics

Operations on encrypted values

Noise and Bootstrap

FHE encrypts data as LWE ciphertexts, represented visually as a bit vector. The encrypted message is located in the higher-order (yellow) bits, while the lower-order (gray) bits contain random noise that ensures the security of the ciphertext.

Each operation on an encrypted value increases the noise, and if it becomes too large, it may overlap with the message and corrupt its value. To reduce the noise of a ciphertext, the Bootstrap operation generates a new ciphertext encrypting the same message, but with lower noise. This allows additional operations to be performed on the encrypted message.

In typical FHE programs, operations are followed by a bootstrap, and this sequence repeats multiple times.

Probability of Error

The amount of noise in a ciphertext is not as bounded as it may appear in the above illustration. As the errors are drawn randomly from a Gaussian distribution, they can be of varying size. This means that we need to be careful to ensure the noise terms do not affect the message bits. If the error terms do overflow into the message bits, this can cause an incorrect output (failure) when bootstrapping.

Function evaluation

Concrete uses PBS to evaluate functions homomorphically:

For example, consider a function (or circuit) that takes a 4-bit input variable and outputs the maximum of a clear constant and the encrypted input:

This function can be turned into a table lookup:

The lookup table lut is then applied during the Programmable Bootstrap.
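As a concrete illustration of the function and table described above, here is a clear-integer sketch. The constant 5 and the name encrypted_max are assumptions chosen for this example; only the name lut comes from the text.

```python
CONSTANT = 5  # hypothetical clear constant chosen for this illustration


def encrypted_max(x):
    """Maximum of a clear constant and a 4-bit input (on clear values here)."""
    return max(CONSTANT, x)


# One entry per possible 4-bit input value.
lut = [encrypted_max(x) for x in range(2**4)]
```

Evaluating the function on an encrypted input then amounts to looking up lut at the encrypted index, which is what the PBS performs.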

PBS management

You don't need to manage PBS operations manually, as they are handled automatically by Concrete during the compilation process. Each function evaluation is converted into a lookup table and evaluated via PBS.

For example, if you inspect the MLIR code generated by the frontend, you’ll see the lookup table in the 4th line of the following output:

There are 2 things to keep in mind about PBS:

  • Input type constraints: PBS operations add constraints on the input type and thus limit the maximum bit-width supported in Concrete.

Tagging

This document explains the concept of tagging, which is a debugging tool to make a link between the user's Python code and the Concrete MLIR circuits. Such a link can be useful when an issue is raised by the compiler on some MLIR, to know which Python code it corresponds to.

When you have big circuits, keeping track of which node corresponds to which part of your code becomes difficult. A tagging system can simplify such situations:

When you compile f with inputset of range(10), you get the following graph:

If you get an error, you'll see exactly where the error occurred (e.g., which layer of the neural network, if you tag layers).

In the future, we plan to use tags for additional features (e.g., to measure performance of tagged regions), so it's a good idea to start utilizing them for big circuits.

Project layout

Concrete layout

Concrete is a modular framework composed by sub-projects using different technologies, all having theirs own build system and test suite. Each sub-project have is own README that explain how to setup the developer environment, how to build it and how to run tests commands.

Concrete is made of 4 main categories of sub-project that are organized in subdirectories from the root of the Concrete repo:

  • frontends contains high-level transpilers that target end users developers who want to use the Concrete stack easily from their usual environment. There are for now only one frontend provided by the Concrete project: a Python frontend named concrete-python.

  • compilers contains the sub-projects in charge of actually solving the compilation problem of an high-level abstraction of FHE to an actual executable. concrete-optimizer is a Rust based project that solves the optimization problems of an FHE dag to a TFHE dag and concrete-compiler which use concrete-optimizer is an end-to-end MLIR-based compiler that takes a crypto free FHE dialect and generates compilation artifacts both for the client and the server. concrete-compiler project provide in addition of the compilation engine, a client and server library in order to easily play with the compilation artifacts to implement a client and server protocol.

  • backends contains C APIs that can be called by the concrete-compiler runtime to perform the cryptographic operations. There are currently two backends:

    • concrete-cpu, which uses TFHE-rs, the fastest implementation of TFHE on CPU.

    • concrete-cuda, which provides GPU acceleration of TFHE primitives.

  • tools contains all the other sub-projects that do not fit in the three previous categories and that are used as common support by the others.

Concrete Python layout

The module structure of Concrete Python is outlined below. You are encouraged to check individual .py files to learn more.

  • concrete

    • fhe

      • dtypes: data type specifications (e.g., int4, uint5, float32)

      • values: value specifications (i.e., data type + shape + encryption status)

      • representation: representation of computation (e.g., computation graphs, nodes)

      • tracing: tracing of python functions

      • mlir: computation graph to mlir conversion

      • compilation: configuration, compiler, artifacts, circuit, client/server, and anything else related to compilation

Compiler backend

There are client and server features.

Client features are:

  • private (G)LWE key generation (currently random bits)

  • encryption and decryption using a private key

  • public key generation from private keys for keyswitch, bootstrap or private packing

  • (de)serialization of ciphertexts and public keys (also needed server side)

Server features are homomorphic operations on ciphertexts:

  • linear operations (multisums with plain weights)

  • keyswitch

  • simple PBS

  • WoP PBS

There are currently 2 backends:

  • concrete-cpu which implements both client and server features targeting the CPU.

  • concrete-cuda which implements only server features targeting GPUs to accelerate homomorphic circuit evaluation.

The compiler uses concrete-cpu for the client and can use either concrete-cpu or concrete-cuda for the server.

Adding a new backend

Context

There are client features (private and public key generation, encryption and decryption) and server features (homomorphic operations on ciphertexts using public keys).

Considering that:

  • performance improvements are mostly beneficial for the server operations, and

  • the client needs to be portable for the variety of clients that may exist,

we expect mostly server backends to be added to the compiler to improve performance (e.g., by using specialized hardware).

What is needed in the server backend

The server backend should expose C or C++ functions to do TFHE operations using the current ciphertext and key memory representation (or functions to change representation). A backend can support only a subset of the current TFHE operations.

The most common operations one would be expected to add are WP-PBS (standard TFHE programmable bootstrap), keyswitch and WoP PBS (without-padding bootstrap).

Linear operations may also be supported but may need more work since their introduction may interfere with other compilation passes. The following example does not include this.

Concrete-cuda example

We will detail how concrete-cuda is integrated in the compiler. Adding a new server feature backend (for non linear operations) should be quite similar. However, if you want to integrate a backend but it does not fit with this description, please open an issue or contact us to discuss the integration.

In compilers/concrete-compiler/Makefile

  • the variable CUDA_SUPPORT has been added and set to OFF (CUDA_SUPPORT?=OFF) by default

  • the variables CUDA_SUPPORT and CUDA_PATH are passed to CMake

In compilers/concrete-compiler/compiler/include/concretelang/Runtime/context.h, the RuntimeContext struct is enriched with state to manage the backend resources (behind a #ifdef CONCRETELANG_CUDA_SUPPORT).

In compilers/concrete-compiler/compiler/lib/Runtime/wrappers.cpp, the cuda backend server functions are added (behind a #ifdef CONCRETELANG_CUDA_SUPPORT).

The pass ConcreteToCAPI is modified to have a flag to insert calls to these new wrappers instead of the cpu ones (the code calling this pass is modified accordingly).

It may be possible to replace the cpu wrappers (with a compilation flag) instead of adding new ones to avoid having to change the pass.

In compilers/concrete-compiler/CMakeLists.txt, a section #Concrete Cuda Configuration has been added. Other CMakeLists.txt files have also been modified (or added) with an if(CONCRETELANG_CUDA_SUPPORT) guard to handle header includes, linking, and so on.

MLIR FHE dialects

Introduction

Concrete compiler takes advantage of these concepts by defining a set of dialects, capable of representing an FHE program from an abstract specification that is independent of the actual cryptosystem down to a program that can easily be mapped to function calls of a cryptographic library. The dialects for the representation of an FHE program are:

In addition, the project further defines two dialects that help expose dynamic task-parallelism and static data-flow graphs in order to benefit from multi-core, multi-accelerator and distributed systems. These are:

The figure below illustrates the relationship between the dialects and their embedding into the compilation pipeline.

The following sections focus on the FHE-related dialects, i.e., on the FHELinalg Dialect, the FHE Dialect, the TFHE Dialect and the Concrete Dialect.

The FHE and FHELinalg Dialects: An abstract specification of an FHE program

The top part of the figure shows the components which are involved in the generation of the initial IR, ending with the step labelled MLIR translation. When the initial IR is passed on to Concrete Compiler through its Python bindings, all FHE-related operations are specified using either the FHE or FHELinalg Dialect. Both of these dialects provide operations and data types for the abstract specification of an FHE program, completely independently of a cryptosystem. At this point, the IR simply indicates whether an operand is encrypted (via the type FHE.eint<n>, where n stands for the precision in bits) and what operations are applied to encrypted values. Plaintext values simply use MLIR's builtin integer type i<n> (e.g., i3 or i64).

The FHE Dialect provides scalar operations on encrypted integers, such as additions (FHE.add_eint) or multiplications (FHE.mul_eint), while the FHELinalg Dialect offers operations on tensors of encrypted integers, e.g., matrix products (FHELinalg.matmul_eint_eint) or convolutions (FHELinalg.conv2d).

Upon conversion, the FHELinalg.matmul operation is converted to a linalg.generic operation whose body contains a scalar multiplication (FHE.mul_eint_int) and a scalar addition (FHE.add_eint_int):

The TFHE Dialect: Binding to the TFHE cryptosystem and parametrization

In order to obtain an executable program at the end of the compilation pipeline, the abstract specification of the FHE program must at some point be bound to a specific cryptosystem. This is the role of the TFHE Dialect, whose purpose is:

  • to indicate operations to be carried out using an implementation of the TFHE cryptosystem

  • to parametrize the cryptosystem with key sizes, and

  • to provide a mapping between keys and encrypted values

When lowering the IR based on the FHE Dialect to the TFHE Dialect, the compiler first generates a generic form, in which FHE operations are lowered to TFHE operations and where values are converted to unparametrized TFHE.glwe values. The unparametrized form TFHE.glwe<sk?> simply indicates that a TFHE.glwe value is to be used, but without any indication of the cryptographic parameters and the actual key.

The IR below shows the example program after lowering to unparametrized TFHE:

All operations from the FHE dialect have been replaced with corresponding operations from the TFHE Dialect.

During subsequent parametrization, the compiler can either use a set of default parameters or can obtain a set of parameters from Concrete's optimizer. Either way, an additional pass injects the parameters into the IR, replacing all TFHE.glwe<sk?> instances with TFHE.glwe<i,d,n>, where i is a sequential identifier for a key, d the number of GLWE dimensions and n the size of the GLWE polynomial.

The result of such a parametrization for the example is given below:

In this parametrization, a single key with the ID 0 is used, with a single dimension and a polynomial of size 512.

The Concrete Dialect: Preparing bindings with a crypto library

In the next step of the pipeline, operations and types are lowered to the Concrete Dialect. This dialect provides operations that are implemented by one of Concrete's backend libraries, but still abstracts from any technical details required for interaction with an actual library. The goal is to maintain a high-level representation with value-based semantics and actual operations instead of buffer semantics and library calls, while ensuring that all operations can effectively be lowered to a library call later in the pipeline. However, the abstract types from TFHE are already lowered to tensors of integers with a suitable shape that will hold the binary data of the encrypted values.

The result of the lowering of the example to the Concrete Dialect is shown below:

Bufferization and emitting library calls

The result for the example is given below:

At this stage, the IR is only composed of operations from builtin Dialects and thus amenable to lowering to LLVM-IR using the lowering passes provided by MLIR.

Multi parameters

This document explains the implications and configuration of multi parameters in Concrete.

In Concrete, integers are encrypted and processed based on a set of cryptographic parameters. By default, the Concrete optimizer selects multiple sets of these parameters, which may not be optimal for every use case. In such cases, you can choose to use mono parameters instead.

When multi parameters are enabled, the optimizer selects a different set of parameters for each bit-width in the circuit. This approach has several implications:

  • Faster execution in general

  • Slower key generation

  • Larger keys

  • Larger memory usage during execution

To disable multi parameters, use the parameter_selection_strategy=fhe.ParameterSelectionStrategy.MONO configuration option.

Multi precision

This document explains the multi-precision option for bit-width assignment for integers.

The multi-precision option enables the frontend to use the smallest bit-width possible for each operation in Fully Homomorphic Encryption (FHE), improving computation efficiency.

Bit-width and encoding differences

Each integer in the circuit has a certain bit-width, which is determined by the input-set. These bit-widths are visible when graphs are printed, for example:

However, adding integers with different bit-widths (for example, 3-bit and 4-bit numbers) directly isn't possible due to differences in encoding, as shown below:

When you add a 3-bit number and a 4-bit number, the result is a 5-bit number with a different encoding:
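The arithmetic behind this statement can be checked with a few lines of plain Python. The sketch below assumes unsigned integers, and `bits_needed` is a hypothetical helper that is not part of Concrete.

```python
def bits_needed(max_value: int) -> int:
    """Smallest unsigned bit-width able to hold every value in [0, max_value]."""
    return max(1, max_value.bit_length())


max_3_bit = 2**3 - 1              # largest 3-bit value: 7
max_4_bit = 2**4 - 1              # largest 4-bit value: 15
max_sum = max_3_bit + max_4_bit   # largest possible sum: 22

print(bits_needed(max_3_bit), bits_needed(max_4_bit), bits_needed(max_sum))
```

The largest sum, 22, no longer fits in 4 bits, which is why the result of the addition needs 5 bits.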

Bit-width assignment with graph processing

To address these encoding differences, a graph processing step called bit-width assignment is performed. This step updates the graph's bit-widths to ensure compatibility with Fully Homomorphic Encryption (FHE).

After this step, the graph might look like this:

Encoding flexibility with Table Lookup

Most operations cannot change the encoding, requiring the input and output bit-widths to remain the same. However, the table lookup operation can change the encoding. For example, consider the following graph:

This graph represents the computation (x**2) + y where x is a 2-bit value and y is a 5-bit value. Without the ability to change encodings, all bit-widths would need to be adjusted to 6 bits. However, since the encoding can change, bit-widths are assigned more efficiently:

In this case, x remains a 2-bit integer, but the Table Lookup result and y are set to 6 bits to allow for the addition.
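The difference between the two assignments can be reproduced with a simplified model. The sketch below is an assumption-laden illustration, not Concrete's bit-width assignment algorithm: the node names, the `bits_needed` helper and the hard-coded list of nodes tied together by the addition are all hypothetical.

```python
def bits_needed(max_value: int) -> int:
    """Smallest unsigned bit-width able to hold every value in [0, max_value]."""
    return max(1, max_value.bit_length())


# Measured bit-widths for (x**2) + y with x in [0, 3] and y in [0, 31].
measured = {
    "x": bits_needed(3),             # 2 bits
    "tlu": bits_needed(3**2),        # x**2 is at most 9: 4 bits
    "y": bits_needed(31),            # 5 bits
    "add": bits_needed(3**2 + 31),   # the sum is at most 40: 6 bits
}

# Single precision: every node uses the largest bit-width in the graph.
single = {node: max(measured.values()) for node in measured}

# Multi precision: only the nodes tied together by the addition must agree.
# The table lookup may change encoding, so its input x keeps its own width.
multi = dict(measured)
tied = ["tlu", "y", "add"]
shared = max(measured[node] for node in tied)
for node in tied:
    multi[node] = shared

print("single precision:", single)
print("multi precision: ", multi)
```

In this model only `x` differs between the two results: it needs 6 bits under single precision but stays at 2 bits under multi-precision.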

Enabling and disabling multi-precision

This approach to bit-width assignment is known as multi-precision and is enabled by default. To disable multi-precision and enforce a single precision across the circuit, use the single_precision=True configuration option.

Progressbar

This document introduces the progressbar feature that provides visual feedback on the execution progress of large circuits, which can take considerable time to execute.

The following Python code demonstrates how to enable and use the progressbar:

When you run this code, you will see a progressbar like this one:

As the execution proceeds, the progress bar updates:

The progress bar does not measure time. When it shows 50%, it indicates that half of the nodes in the computation graph have been processed, not that half of the time has elapsed. The duration of processing different node types may vary, so the progress bar should not be used to estimate the remaining time.
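The gap between node-based progress and elapsed time can be illustrated with a small simulation. The per-node costs below are invented for the example and do not come from any real circuit.

```python
# Hypothetical per-node costs in seconds: four cheap nodes, then four costly ones.
node_costs = [0.01] * 4 + [2.0] * 4

total_nodes = len(node_costs)
total_time = sum(node_costs)

history = []   # (fraction of nodes processed, fraction of time elapsed)
elapsed = 0.0
for processed, cost in enumerate(node_costs, start=1):
    elapsed += cost
    history.append((processed / total_nodes, elapsed / total_time))

for node_progress, time_progress in history:
    print(f"nodes: {node_progress:4.0%}   time: {time_progress:6.1%}")
```

In this example, when half of the nodes have been processed, less than 1% of the total time has elapsed.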

Once the progressbar fills and execution completes, you will see the following figure:

Statistics

This document provides an overview of how to analyze compiled circuits and extract statistical data for performance evaluation in Concrete. These statistics help identify bottlenecks and compare circuits.

Basic operations

Concrete calculates statistics based on the following seven basic operations:

  • Clear addition: x + y where x is encrypted and y is clear

  • Encrypted addition: x + y where both x and y are encrypted

  • Clear multiplication: x * y where x is encrypted and y is clear

  • Encrypted negation: -x where x is encrypted

  • Key switch: A building block for table lookups

  • Packing key switch: A building block for table lookups

  • Programmable bootstrapping: A building block for table lookups

Displaying statistics

You can print all statistics using the show_statistics configuration option:

This code will print:

Each of these properties can be directly accessed on the circuit (e.g., circuit.programmable_bootstrap_count).

Tags

If you have a neural network with 10 layers, each of them tagged, you can easily see the number of additions and multiplications required for matrix multiplications per layer:
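As a rough illustration of what grouping statistics by tag means, the following sketch counts operations per tag with the standard library. The list of (tag, operation) pairs is made up for the example, and the code does not use or reproduce Concrete's statistics API.

```python
from collections import Counter

# Hypothetical (tag, operation) pairs, as if collected from a tagged circuit.
operations = [
    ("layer1", "clear_multiplication"),
    ("layer1", "encrypted_addition"),
    ("layer1", "encrypted_addition"),
    ("layer2", "clear_multiplication"),
    ("layer2", "clear_multiplication"),
    ("layer2", "encrypted_addition"),
]

# Build one counter of operations for each tag.
per_tag = {}
for layer, operation in operations:
    per_tag.setdefault(layer, Counter())[operation] += 1

for layer, counts in per_tag.items():
    print(layer, dict(counts))
```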

Configure

This document provides instructions on how to customize the compilation pipeline using Configurations in Python and describes various configuration options available.

You can customize Concrete using fhe.Configuration:

You can overwrite individual configuration options by specifying kwargs in the compile method:

You can also combine both ways:

When options are specified both in the configuration and as kwargs in the compile method, the kwargs take precedence.
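The precedence rule can be modelled in a few lines of plain Python. This is a sketch under stated assumptions: `SketchConfiguration` and `compile_with` are hypothetical stand-ins that only mimic the override behaviour described above; they are not `fhe.Configuration` or the real compile method.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchConfiguration:
    """Hypothetical stand-in holding two of the documented options."""
    verbose: bool = False
    show_graph: Optional[bool] = None


def compile_with(configuration: SketchConfiguration, **kwargs) -> SketchConfiguration:
    """Return the effective options: kwargs override the stored configuration."""
    return replace(configuration, **kwargs)


base = SketchConfiguration(verbose=True, show_graph=False)
effective = compile_with(base, show_graph=True)

print(effective)
```

Here `show_graph` comes from the keyword argument, `verbose` is inherited from the configuration, and the original configuration object is left unchanged.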

Options

approximate_rounding_config: ApproximateRoundingConfig = fhe.ApproximateRoundingConfig()

  • Controls the behavior of approximate rounding:

    • To enable exact clipping,

    • And/or approximate clipping, which makes overflow protection faster.

auto_adjust_rounders: bool = False

  • Adjust rounders automatically.

auto_parallelize: bool = False

  • Enable auto parallelization in the compiler.

bitwise_strategy_preference: Optional[Union[BitwiseStrategy, str, List[Union[BitwiseStrategy, str]]]] = None

compiler_debug_mode: bool = False

  • Enable or disable the debug mode of the compiler. This can show a lot of information, including passes and pattern rewrites.

compiler_verbose_mode: bool = False

  • Enable or disable verbose mode of the compiler. This mainly shows logs from the compiler and is less verbose than the debug mode.

comparison_strategy_preference: Optional[Union[ComparisonStrategy, str, List[Union[ComparisonStrategy, str]]]] = None

compress_evaluation_keys: bool = False

  • Specify that serialization takes the compressed form of evaluation keys.

compress_input_ciphertexts: bool = False

  • Specify that serialization takes the compressed form of input ciphertexts.

composable: bool = False

  • Specify that the function must be composable with itself.

dataflow_parallelize: bool = False

  • Enable dataflow parallelization in the compiler.

dump_artifacts_on_unexpected_failures: bool = True

  • Export debugging artifacts automatically on compilation failures.

enable_tlu_fusing: bool = True

  • Enables Table Lookup (TLU) fusing to reduce the number of TLUs.

enable_unsafe_features: bool = False

  • Enable unsafe features.

fhe_execution: bool = True

  • Enable FHE execution. Can be enabled later using circuit.enable_fhe_execution().

fhe_simulation: bool = False

  • Enable FHE simulation. Can be enabled later using circuit.enable_fhe_simulation().

global_p_error: Optional[float] = None

  • Global error probability for the whole circuit.

if_then_else_chunk_size: int = 3

  • Chunk size to use when converting the fhe.if_then_else extension.

insecure_key_cache_location: Optional[Union[Path, str]] = None

  • Location of insecure key cache.

loop_parallelize: bool = True

  • Enable loop parallelization in the compiler.

multi_parameter_strategy: fhe.MultiParameterStrategy = fhe.MultiParameterStrategy.PRECISION

  • Set the level of circuit partitioning when using fhe.ParameterSelectionStrategy.MULTI.

    • PRECISION: all TLUs with the same input precision have their own parameters.

optimize_tlu_based_on_measured_bounds: bool = False

  • Enables TLU optimizations based on measured bounds.

  • Not enabled by default, as it could result in unexpected overflows during runtime.

optimize_tlu_based_on_original_bit_width: Union[bool, int] = 8

  • Configures whether to convert values to their original precision before doing a table lookup on them.

    • True enables it for all cases.

    • False disables it for all cases.

    • Integer value enables or disables it depending on the original bit width. With the default value of 8, only the values with original bit width ≤ 8 will be converted to their original precision.
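The three cases above amount to a small decision rule, sketched here in plain Python. The function name `converts_to_original_precision` is hypothetical; the sketch only restates the documented behaviour and is not taken from Concrete's source.

```python
from typing import Union


def converts_to_original_precision(option: Union[bool, int], original_bit_width: int) -> bool:
    """Decide whether a value is converted back before a table lookup."""
    # bool must be tested first, because bool is a subclass of int in Python.
    if isinstance(option, bool):
        return option
    # An integer option acts as an inclusive upper limit on the original bit width.
    return original_bit_width <= option


print(converts_to_original_precision(8, 6))       # within the default limit
print(converts_to_original_precision(8, 10))      # above the default limit
print(converts_to_original_precision(True, 10))   # enabled for all cases
print(converts_to_original_precision(False, 6))   # disabled for all cases
```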

p_error: Optional[float] = None

  • Error probability for individual table lookups.

parameter_selection_strategy: fhe.ParameterSelectionStrategy = fhe.ParameterSelectionStrategy.MULTI

  • Set how cryptographic parameters are selected.

print_tlu_fusing: bool = False

  • Enables printing of TLU fusing to see which table lookups are fused.

progress_tag: Union[bool, int] = False

  • How many nested tag elements to display with the progress bar.

    • True means all tag elements

    • False disables the display.

    • 2 will display elmt1.elmt2.
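The effect of the three kinds of values can be sketched as follows, assuming that nested tags are joined with dots as in the example above. `displayed_tag` is a hypothetical helper written for this illustration, not a Concrete function.

```python
from typing import Union


def displayed_tag(tag: str, progress_tag: Union[bool, int]) -> str:
    """Return the part of a dotted tag that would be shown next to the progress bar."""
    if progress_tag is False:
        return ""                       # display disabled
    if progress_tag is True:
        return tag                      # all tag elements
    elements = tag.split(".")
    return ".".join(elements[:progress_tag])   # only the first N elements


full_tag = "elmt1.elmt2.elmt3"
print(repr(displayed_tag(full_tag, True)))
print(repr(displayed_tag(full_tag, False)))
print(repr(displayed_tag(full_tag, 2)))
```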

progress_title: str = ""

  • Title of the progress bar.

rounding_exactness: Exactness = fhe.Exactness.EXACT

  • Set default exactness mode for the rounding operation:

    • EXACT: threshold for rounding up or down is exactly centered between the upper and lower value.

relu_on_bits_chunk_size: int = 3

relu_on_bits_threshold: int = 7

shifts_with_promotion: bool = True

show_graph: Optional[bool] = None

  • Print computation graph during compilation.

    • True means always print

    • False means never print

    • None means print depending on verbose configuration.
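The same three-state pattern is shared by show_graph, show_mlir, show_optimizer and show_statistics. A minimal sketch of it is shown below; it assumes that None simply falls back to the value of verbose, which is a simplification, and `should_print` is a hypothetical helper rather than part of Concrete.

```python
from typing import Optional


def should_print(option: Optional[bool], verbose: bool) -> bool:
    """Resolve a three-state show_* option against the verbose setting."""
    if option is None:
        return verbose   # assumption: None defers to the verbose option
    return option        # True or False is taken as-is


print(should_print(None, verbose=True))
print(should_print(None, verbose=False))
print(should_print(False, verbose=True))
print(should_print(True, verbose=False))
```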

show_mlir: Optional[bool] = None

  • Print MLIR during compilation.

    • True means always print

    • False means never print

    • None means print depending on verbose configuration.

show_optimizer: Optional[bool] = None

  • Print optimizer output during compilation.

    • True means always print

    • False means never print

    • None means print depending on verbose configuration.

show_progress: bool = False

  • Display a progress bar during circuit execution.

show_statistics: Optional[bool] = None

  • Print circuit statistics during compilation.

    • True means always print

    • False means never print

    • None means print depending on verbose configuration.

simulate_encrypt_run_decrypt: bool = False

  • Whether to use the simulate encrypt/run/decrypt methods of the circuit/module instead of actual encryption/evaluation/decryption.

    • When this option is set to True, encrypt and decrypt are identity functions, and run is a wrapper around simulation. In other words, this option allows switching off encryption to quickly test if a function has the expected semantic (without paying the price of FHE execution).

    • This is extremely unsafe and should only be used during development.

    • For this reason, it requires enable_unsafe_features to be set to True.

single_precision: bool = False

  • Use single precision for the whole circuit.

use_gpu: bool = False

  • Enable generating code for GPU in the compiler.

use_insecure_key_cache: bool = False (Unsafe)

  • Use the insecure key cache.

verbose: bool = False

  • Print details related to compilation.

: Compiler submodule.

: Client parameters.

: Client support.

: CompilationContext.

: Compilation feedback.

: CompilationOptions.

: EvaluationKeys.

: KeySet.

: KeySetCache.

: LambdaArgument.

: LibraryCompilationResult.

: LibraryLambda.

: LibrarySupport.

: LweSecretKey.

: Parameter.

: PublicArguments.

: PublicResult.

: ServerCircuit.

: ServerProgram.

: SimulatedValueDecrypter.

: SimulatedValueExporter.

: Import and export TFHErs integers into Concrete.

: Common utils for the compiler submodule.

: Value.

: ValueDecrypter.

: ValueExporter.

: Wrapper for native Cpp objects.

: Concrete.

: Glue the compilation process together.

: Declaration of DebugArtifacts class.

: Declaration of Circuit class.

: Declaration of Client class.

: Declaration of Compiler class.

: Declaration of classes related to composition.

: Declaration of Configuration class.

: Declaration of circuit and compiler decorators.

: Declaration of Keys class.

: Declaration of FheModule classes.

: Declaration of MultiCompiler class.

: Declaration of Server class.

: Declaration of ClientSpecs class.

: Declaration of EncryptionStatus class.

: Declaration of various functions and constants related to compilation.

: Declaration of Value class.

: Declaration of wiring related class.

: Define available data types and their semantics.

: Declaration of BaseDataType abstract class.

: Declaration of Float class.

: Declaration of Integer class.

: Declaration of various functions and constants related to data types.

: Provide additional features that are not present in numpy.

: Declaration of array function, to simplify creation of encrypted arrays.

: Bit extraction extensions.

: Declaration of constant functions, to allow server side trivial encryption.

: Tracing and evaluation of convolution.

: Declaration of hinting extensions, to provide more information to Concrete.

: Declaration of identity extension.

: Tracing and evaluation of maxpool.

: Declaration of multivariate extension.

: Declaration of ones and one functions, to simplify creation of encrypted ones.

: Declaration of relu extension.

: Declaration of round_bit_pattern function, to provide an interface for rounded table lookups.

: Declaration of LookupTable class.

: Declaration of tag context manager, to allow tagging certain nodes.

: Declaration of truncate_bit_pattern extension.

: Declaration of univariate function.

: Declaration of zeros and zero functions, to simplify creation of encrypted zeros.

.

: Declaration of various functions and constants related to the entire project.

: Provide computation graph to mlir functionality.

: Declaration of Context class.

: Declaration of ConversionType and Conversion classes.

: Declaration of Converter class.

: All graph processors.

: Declaration of AssignBitWidths graph processor.

: Declaration of AssignNodeIds graph processor.

: Declaration of CheckIntegerOnly graph processor.

: Declaration of ProcessRounding graph processor.

: Declaration of various functions and constants related to MLIR conversion.

: Define structures used to represent computation.

: Declaration of various Evaluator classes, to make graphs picklable.

: Declaration of Graph class.

: Declaration of Node class.

: Declaration of Operation enum.

: Declaration of various functions and constants related to representation of computation.

: tfhers module to represent, and compute on tfhers integer values.

: Declaration of tfhers.Bridge class.

: Declaration of TFHERSIntegerType class.

: Tracing of tfhers operations.

: Declaration of TFHERSInteger which wraps values as being of tfhers types.

: Provide function to computation graph functionality.

: Declaration of Tracer class.

: Declaration of type annotation.

: Define the available values and their semantics.

: Declaration of ClearScalar and EncryptedScalar wrappers.

: Declaration of ClearTensor and EncryptedTensor wrappers.

: Declaration of ValueDescription class.

: Concretelang python module

: FHE dialect module

: FHELinalg dialect module

: Tracing dialect module

: ClientParameters are public parameters used for key generation.

: Client interface for doing key generation and encryption.

: Support class for compilation context.

: CircuitCompilationFeedback is a set of hints computed by the compiler engine for a circuit.

: CompilationFeedback is a set of hints computed by the compiler engine.

: CompilationOptions holds different flags and options of the compilation process.

: EvaluationKeys required for execution.

: KeySet stores the different keys required for an encrypted computation.

: KeySetCache is a cache for KeySet to avoid generating similar keys multiple times.

: LambdaArgument holds scalar or tensor values.

: LibraryCompilationResult holds the result of the library compilation.

: LibraryLambda references a compiled library and can be run using LibrarySupport.

: Support class for library compilation and execution.

: An LweSecretKey.

: LWE Secret Key Parameters

: An FHE parameter.

: PublicArguments holds encrypted and plain arguments, as well as public materials.

: PublicResult holds the result of an encrypted execution and can be decrypted using ClientSupport.

: ServerCircuit references a circuit that can be called for execution and simulation.

: ServerProgram references compiled circuit objects.

: A helper class to decrypt Values.

: A helper class to create Values.

: A helper class to import and export TFHErs big integers.

: A helper class to create TfhersFheIntDescriptions.

: An encrypted/clear value which can be scalar/tensor.

: A helper class to decrypt Values.

: A helper class to create Values.

: Wrapper base class for native Cpp objects.

: DebugArtifacts class, to export information about the compilation process for single function.

: A debug manager, allowing streamlined debugging.

: An object containing debug artifacts for a certain function in an fhe module.

: An object containing debug artifacts for an fhe module.

: Circuit class, to combine computation graph, mlir, client and server into a single object.

: Client class, which can be used to manage keys, encrypt arguments and decrypt results.

: Compiler class, to glue the compilation pipeline.

: A raw composition clause.

: A protocol for composition policies.

: A raw composition rule.

: Controls the behavior of approximate rounding.

: Compilable class, to wrap a function and provide methods to trace and compile it.

: Keys class, to manage generate/reuse keys.

: Runtime object class for execution.

: Fhe function class, allowing to run or simulate one function of an fhe module.

: Fhe module class, to combine computation graphs, mlir, runtime objects into a single object.

: Runtime object class for simulation.

: An object representing the definition of a function as used in an fhe module.

: Compiler class for multiple functions, to glue the compilation pipeline.

: Server class, which can be used to perform homomorphic computation.

: ClientSpecs class, to create Client objects.

: EncryptionStatus enum, to represent encryption status of parameters.

: A lazily initialized value.

: Value class, to store scalar or tensor values which can be encrypted or clear.

: Composition policy that allows forwarding any output of the module to any of its inputs.

: All the encrypted inputs of a given function of a module.

: All the encrypted outputs of a given function of a module.

: The input of a given function of a module.

: Composition policy that does not allow the forwarding of any output to any input.

: The output of a given function of a module.

: A wrapper type used to trace wiring.

: A forwarding rule between an output and an input.

: A protocol for wire inputs.

: A protocol for wire outputs.

: A context manager returned by the wire_pipeline method.

: Composition policy which allows the forwarding of certain outputs to certain inputs.

: BaseDataType abstract class, to form a basis for data types.

: Float class, to represent floating point numbers.

: Integer class, to represent integers.

: Bits class, to provide indexing into the bits of integers.

: Adjusting class, to be used as early stop signal during adjustment.

: AutoRounder class, to optimize for number of msbs to keep during round bit pattern operation.

: LookupTable class, to provide a way to do direct table lookups.

: Adjusting class, to be used as early stop signal during adjustment.

: AutoTruncator class, to optimize for the number of msbs to keep during truncate operation.

: Context class, to perform operations on conversions.

: Conversion class, to store MLIR operations with additional information.

: ConversionType class, to make it easier to work with MLIR types.

: Converter class, to convert a computation graph to MLIR.

: AdditionalConstraints class to customize bit-width assignment step easily.

: AssignBitWidths graph processor, to assign proper bit-widths to be compatible with FHE.

to node properties.

: CheckIntegerOnly graph processor, to make sure the graph only contains integer nodes.

: ProcessRounding graph processor, to analyze rounding and support regular operations on it.

: Comparison enum, to store the result comparison in 2-bits as there are three possible outcomes.

: HashableNdarray class, to use numpy arrays in dictionaries.

: ConstantEvaluator class, to evaluate Operation.Constant nodes.

: GenericEvaluator class, to evaluate Operation.Generic nodes.

: GenericEvaluator class, to evaluate Operation.Generic nodes where args are packed in a tuple.

: InputEvaluator class, to evaluate Operation.Input nodes.

: Graph class, to represent computation graphs.

: GraphProcessor base class, to define the API for a graph processing pipeline.

: MultiGraphProcessor base class, to define the API for a multiple graph processing pipeline.

: Node class, to represent computation in a computation graph.

: Operation enum, to distinguish nodes within a computation graph.

: TFHErs Bridge extends a Circuit with TFHErs functionalities.

: Crypto parameters used for a tfhers integer.

: TFHErs key choice: big or small.

to represent tfhers integer types.

into typed values, using tfhers types.

: Base annotation for direct definition.

: Base scalar annotation for direct definition.

: Base tensor annotation for direct definition.

: Tracer class, to create computation graphs from python functions.

: Scalar f32 annotation.

: Scalar f64 annotation.

: Scalar int1 annotation.

: Scalar int10 annotation.

: Scalar int11 annotation.

: Scalar int12 annotation.

: Scalar int13 annotation.

: Scalar int14 annotation.

: Scalar int15 annotation.

: Scalar int16 annotation.

: Scalar int17 annotation.

: Scalar int18 annotation.

: Scalar int19 annotation.

: Scalar int2 annotation.

: Scalar int20 annotation.

: Scalar int21 annotation.

: Scalar int22 annotation.

: Scalar int23 annotation.

: Scalar int24 annotation.

: Scalar int25 annotation.

: Scalar int26 annotation.

: Scalar int27 annotation.

: Scalar int28 annotation.

: Scalar int29 annotation.

: Scalar int3 annotation.

: Scalar int30 annotation.

: Scalar int31 annotation.

: Scalar int32 annotation.

: Scalar int33 annotation.

: Scalar int34 annotation.

: Scalar int35 annotation.

: Scalar int36 annotation.

: Scalar int37 annotation.

: Scalar int38 annotation.

: Scalar int39 annotation.

: Scalar int4 annotation.

: Scalar int40 annotation.

: Scalar int41 annotation.

: Scalar int42 annotation.

: Scalar int43 annotation.

: Scalar int44 annotation.

: Scalar int45 annotation.

: Scalar int46 annotation.

: Scalar int47 annotation.

: Scalar int48 annotation.

: Scalar int49 annotation.

: Scalar int5 annotation.

: Scalar int50 annotation.

: Scalar int51 annotation.

: Scalar int52 annotation.

: Scalar int53 annotation.

: Scalar int54 annotation.

: Scalar int55 annotation.

: Scalar int56 annotation.

: Scalar int57 annotation.

: Scalar int58 annotation.

: Scalar int59 annotation.

: Scalar int6 annotation.

: Scalar int60 annotation.

: Scalar int61 annotation.

: Scalar int62 annotation.

: Scalar int63 annotation.

: Scalar int64 annotation.

: Scalar int7 annotation.

: Scalar int8 annotation.

: Scalar int9 annotation.

: Tensor annotation.

: Scalar uint1 annotation.

: Scalar uint10 annotation.

: Scalar uint11 annotation.

: Scalar uint12 annotation.

: Scalar uint13 annotation.

: Scalar uint14 annotation.

: Scalar uint15 annotation.

: Scalar uint16 annotation.

: Scalar uint17 annotation.

: Scalar uint18 annotation.

: Scalar uint19 annotation.

: Scalar uint2 annotation.

: Scalar uint20 annotation.

: Scalar uint21 annotation.

: Scalar uint22 annotation.

: Scalar uint23 annotation.

: Scalar uint24 annotation.

: Scalar uint25 annotation.

: Scalar uint26 annotation.

: Scalar uint27 annotation.

: Scalar uint28 annotation.

: Scalar uint29 annotation.

: Scalar uint3 annotation.

: Scalar uint30 annotation.

: Scalar uint31 annotation.

: Scalar uint32 annotation.

: Scalar uint33 annotation.

: Scalar uint34 annotation.

: Scalar uint35 annotation.

: Scalar uint36 annotation.

: Scalar uint37 annotation.

: Scalar uint38 annotation.

: Scalar uint39 annotation.

: Scalar uint4 annotation.

: Scalar uint40 annotation.

: Scalar uint41 annotation.

: Scalar uint42 annotation.

: Scalar uint43 annotation.

: Scalar uint44 annotation.

: Scalar uint45 annotation.

: Scalar uint46 annotation.

: Scalar uint47 annotation.

: Scalar uint48 annotation.

: Scalar uint49 annotation.

: Scalar uint5 annotation.

: Scalar uint50 annotation.

: Scalar uint51 annotation.

: Scalar uint52 annotation.

: Scalar uint53 annotation.

: Scalar uint54 annotation.

: Scalar uint55 annotation.

: Scalar uint56 annotation.

: Scalar uint57 annotation.

: Scalar uint58 annotation.

: Scalar uint59 annotation.

: Scalar uint6 annotation.

: Scalar uint60 annotation.

: Scalar uint61 annotation.

: Scalar uint62 annotation.

: Scalar uint63 annotation.

: Scalar uint64 annotation.

: Scalar uint7 annotation.

: Scalar uint8 annotation.

: Scalar uint9 annotation.

: ValueDescription class, to combine data type, shape, and encryption status into a single object.

: Check whether a CUDA device is available and online.

: Check whether the compiler and runtime support GPU offloading.

: Initialize dataflow parallelization.

: Parse the MLIR input, then return it back.

: Extract tag of the operation from its location.

: Try to find the absolute path to the runtime library.

: Provide a direct interface for the compilation of single-circuit programs.

: Provide an easy interface for the compilation of single-circuit programs.

: Provide an easy interface to define a function within an FHE module.

: Provide an easy interface for the compilation of multi-function modules.

: Add nodes from from_nodes to to_nodes, to all_nodes.

: Determine if a subgraph can be fused.

: Convert a subgraph to Operation.Generic node.

: Find the closest upstream integer output nodes to a set of start nodes in a graph.

: Find a subgraph with float computations that end with an integer output.

: Find the single lowest common ancestor of a list of nodes.

: Find a subgraph with a tlu computation that has multiple variable inputs where all variable inputs share a common ancestor.

: Convert a type to a string. Remove package name and class/type keywords.

: Fuse appropriate subgraphs in a graph to a single Operation.Generic node.

: Get the terminal size.

: Generate a random inputset.

: Determine if a node is the single common ancestor of a list of nodes.

: Validate input arguments.

: Get the 'BaseDataType' that can represent a set of 'BaseDataType's.

: Create an encrypted array from either encrypted or clear values.

: Extract bits of integers.

: Trivial encryption of a cleartext value.

: Trace and evaluate convolution operations.

: Hint the compilation process about properties of a value.

: Apply identity function to x.

: Refresh x.

: Evaluate or trace MaxPool operation.

: Wrap a multivariate function so that it is traced into a single generic node.

: Create an encrypted scalar with the value of one.

: Create an encrypted array of ones.

: Create an encrypted array of ones with the same shape as another array.

: Rectified linear unit extension.

: Round the bit pattern of an integer.

: Introduce a new tag to the tag stack.

: Truncate the bit pattern of an integer.

: Wrap a univariate function so that it is traced into a single generic node.

: Create an encrypted scalar with the value of zero.

: Create an encrypted array of zeros.

: Create an encrypted array of zeros with the same shape as another array.

: Assert a condition.

: Raise a RuntimeError to indicate unreachable code is entered.

: Construct lookup tables for each cell of the input for an Operation.Generic node.

: Construct the lookup table for an Operation.Generic node.

: Construct the lookup table for a multivariate node.

: Use flooding algorithm to replace None values.

: Get the textual representation of a constant.

: Format an indexing element.

: Create a TFHErs bridge from a circuit.

: Convert a Concrete integer to the tfhers representation.

: Convert a tfhers integer to the Concrete representation.

: Build a clear scalar value.

: Build an encrypted scalar value.

: Build a clear scalar value.

: Build an encrypted scalar value.

: Build a clear tensor value.

: Build an encrypted tensor value.

: Build a clear tensor value.

: Build an encrypted tensor value.

Concrete partially supports floating-point values. Floating-point inputs and outputs are not supported, but intermediate values can be floating point (under certain constraints). An equivalent of fixed-point arithmetic can also be used in Concrete, as described in .
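As a rough illustration of the fixed-point idea mentioned above, the sketch below stores each real number as an integer scaled by a power of two, so that only integer arithmetic is needed. It is plain Python rather than Concrete code, and the names `SCALE_BITS`, `to_fixed`, `from_fixed`, and `fixed_mul` are hypothetical helpers chosen for this example.

```python
SCALE_BITS = 4            # assumption: keep 4 fractional bits
SCALE = 1 << SCALE_BITS   # every encoded integer stands for value * 16


def to_fixed(value: float) -> int:
    """Encode a real number as a scaled integer."""
    return round(value * SCALE)


def from_fixed(encoded: int) -> float:
    """Decode a scaled integer back to a real number."""
    return encoded / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two encoded values and rescale the product.

    The raw product carries 2 * SCALE_BITS fractional bits, so it is
    shifted right by SCALE_BITS to return to the common scale.
    """
    return (a * b) >> SCALE_BITS


a = to_fixed(1.5)
b = to_fixed(2.25)

total = from_fixed(a + b)               # addition needs no rescaling
product = from_fixed(fixed_mul(a, b))   # multiplication needs one shift
```

Only integer additions, multiplications, and shifts appear between encoding and decoding, which is why this kind of encoding is a candidate when inputs and outputs must be integers.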

Specify the resulting data type in the univariate extension (e.g., fhe.univariate(function, outputs=fhe.uint4)(x)), for the same reason as above.

In particular, wheels with GPU support are not on PyPI. You can install them from our package index by adding --extra-index-url https://pypi.zama.ai/gpu, more information on GPU wheels .

The full version requires pygraphviz, which depends on graphviz. Make sure to install all the dependencies on your operating system before installing concrete-python[full].

Installing pygraphviz on macOS can be problematic (see more details ).

This document provides an overview of Fully Homomorphic Encryption (FHE) to get you started with Concrete. For more comprehensive resources about FHE, visit or .

Homomorphic encryption allows computations on ciphertexts without revealing the underlying plaintexts. A scheme is considered fully homomorphic if it supports an unlimited number of additions and multiplications.

Let $x$ represent a plaintext and $E[x]$ the corresponding ciphertext:

Homomorphic addition: $E[x] + E[y] = E[x + y]$

Homomorphic multiplication: $E[x] * E[y] = E[x * y]$

In Concrete, the default failure probability is set to $\frac{1}{100000}$, meaning that 1 in every 100,000 executions may result in an error. Reducing this probability requires adjusting cryptographic parameters, potentially lowering performance. Conversely, allowing a higher probability of error may improve performance.
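To get a feel for what a per-execution failure probability means over many runs, the following sketch computes the expected number of failures and the chance of seeing at least one. It assumes that executions fail independently, which is a simplification made for this example; the function name `prob_at_least_one_error` is hypothetical and not part of Concrete.

```python
def prob_at_least_one_error(p_error: float, executions: int) -> float:
    """Probability that at least one of `executions` independent runs fails."""
    return 1.0 - (1.0 - p_error) ** executions


DEFAULT_P_ERROR = 1 / 100_000   # the default quoted in the text above
RUNS = 100_000

# Expected number of failing runs: p * n
expected_errors = DEFAULT_P_ERROR * RUNS

# Chance that at least one run out of RUNS fails: 1 - (1 - p)^n
at_least_one = prob_at_least_one_error(DEFAULT_P_ERROR, RUNS)
```

Under the independence assumption, one failure is expected on average over 100,000 executions, while the chance of observing at least one failure in that many runs is roughly 63%.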

While we’ve covered arithmetic operations, typical programs also involve functions (for example, maximum, minimum, square root). In , the Bootstrap operation can be enhanced with a , creating a Programmable Bootstrap (PBS).

Homomorphic univariate function evaluation: $f(E[x]) = E[f(x)]$

PBS performance impact: PBS operations are costly, so minimizing the number of PBS can improve circuit performance. PBS cost also varies with input precision (for example, an 8-bit PBS is faster than a 16-bit PBS). To learn more about optimizing PBS, refer to the section.
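The table-lookup view of a univariate function can be sketched in plain Python: the function is evaluated once for every possible input value, and applying it then reduces to indexing the table. This is only an illustration on cleartext integers; `build_lookup_table` is a hypothetical helper, not a Concrete API.

```python
def build_lookup_table(function, bit_width):
    """Tabulate `function` over every unsigned value of the given bit width."""
    return [function(x) for x in range(2 ** bit_width)]


# Table for maximum(5, x) on a 4-bit unsigned input.
lut = build_lookup_table(lambda x: max(5, x), bit_width=4)

# Evaluating the function is now a single table access.
result = lut[3]

# The table has one entry per possible input, so it doubles with each extra bit.
table_sizes = {bits: 2 ** bits for bits in (4, 8, 16)}
```

The growth of the table with the input bit width may give some intuition for why input precision matters for PBS, although the actual cost model of a PBS is not captured by this sketch.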

extensions: custom functionality (see )

The Concrete backends are implementations of the cryptographic primitives of the Zama variant of . The compiler emits code that combines calls into these backends to perform more complex homomorphic operations.

Compilation of a Python program starts with Concrete's Python frontend, which first traces and transforms it and then converts it into an intermediate representation (IR) that is further processed by Concrete Compiler. This IR is based on the of the . This document provides an overview of Concrete's FHE-specific representations based on the MLIR framework.

In contrast to traditional compiler infrastructure, the set of operations and data types that constitute the IR, as well as the level of abstraction that the IR represents, are not fixed in MLIR and can easily be extended. All operations and data types are grouped into dialects, with each dialect representing a specific domain or a specific level of abstraction. Mixing operations and types from different dialects within the same IR is allowed and even encouraged, with all dialects, whether built in or developed as an extension, being first-class citizens.

The FHELinalg Dialect

The FHE Dialect

The TFHE Dialect

The Concrete Dialect

and for debugging purposes, the Tracing Dialect.

The RT Dialect and

The SDFG Dialect.

In a first lowering step of the pipeline, all FHELinalg operations are lowered to operations from using scalar operations from the FHE Dialect. Consider the following example, which consists of a function that performs a multiplication of a matrix of encrypted integers and a matrix of cleartext values:

This is then further lowered to a nest of loops from , implementing the parallel and reduction dimensions from the linalg.generic operation above:
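As a rough plain-Python analogy of that loop nest, the sketch below multiplies a 4x3 matrix by a 3x2 matrix with two outer loops over the output position and one inner loop that accumulates the products. Plain integers stand in for encrypted values, and `matmul_loop_nest` is a hypothetical name; this is not what the compiler emits.

```python
def matmul_loop_nest(lhs, rhs):
    """Multiply two matrices (lists of rows) with an explicit loop nest."""
    rows, inner, cols = len(lhs), len(rhs), len(rhs[0])

    # Start from a zero-filled output, like the zero tensor in the IR.
    out = [[0] * cols for _ in range(rows)]

    for i in range(rows):           # parallel dimension (output row)
        for j in range(cols):       # parallel dimension (output column)
            for k in range(inner):  # reduction dimension
                # Multiply one "encrypted" element by one clear element,
                # then accumulate into the current output cell.
                out[i][j] = out[i][j] + lhs[i][k] * rhs[k][j]
    return out


lhs = [[1, 2, 3], [0, 1, 2], [3, 0, 1], [2, 2, 2]]   # stands in for the 4x3 encrypted matrix
rhs = [[1, 0], [0, 1], [1, 1]]                       # the 3x2 cleartext matrix
result = matmul_loop_nest(lhs, rhs)
```

Each iteration of the innermost loop performs one multiplication by a clear value and one addition, which corresponds to the two scalar operations in the body of the lowered loops.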

The remaining stages of the pipeline are rather technical. Before any binding to an actual Concrete backend library, the compiler first invokes to convert the value-based IR into an IR with buffer semantics. In particular, this means that keys and encrypted values are no longer abstract values in a mathematical sense, but values backed by a memory location that holds the actual data. This form of IR is then suitable for a pass emitting actual library calls that implement the corresponding operations from the Concrete Dialect for a specific backend.

When enabled, you can control the level of circuit partitioning by setting the multi_parameter_strategy as described in .

You can also use tags to analyze specific sections of your circuit. See more detailed explanation in .

Provide fine control for :

Specify preference for bitwise strategies, can be a single strategy or an ordered list of strategies. See to learn more.

Specify preference for comparison strategies. Can be a single strategy or an ordered list of strategies. See to learn more.

Only used when compiling a single circuit; when compiling modules, use the .

If set, the whole circuit will have the probability of a non-exact result smaller than the set value. See to learn more.

PRECISION_AND_NORM2: all TLUs with the same input precision and output have their own parameters.

If set, all table lookups will have the probability of a non-exact result smaller than the set value. See to learn more.

APPROXIMATE: faster but threshold for rounding up or down is approximately centered with a pseudo-random shift. Precise behavior is described in .

Chunk size of the ReLU extension when implementation is used.

Bit-width to start implementing the ReLU extension with .

Enable promotions in encrypted shifts instead of casting at runtime. See to learn more.

concrete.compiler
concrete.compiler.client_parameters
concrete.compiler.client_support
concrete.compiler.compilation_context
concrete.compiler.compilation_feedback
concrete.compiler.compilation_options
concrete.compiler.evaluation_keys
concrete.compiler.key_set
concrete.compiler.key_set_cache
concrete.compiler.lambda_argument
concrete.compiler.library_compilation_result
concrete.compiler.library_lambda
concrete.compiler.library_support
concrete.compiler.lwe_secret_key
concrete.compiler.parameter
concrete.compiler.public_arguments
concrete.compiler.public_result
concrete.compiler.server_circuit
concrete.compiler.server_program
concrete.compiler.simulated_value_decrypter
concrete.compiler.simulated_value_exporter
concrete.compiler.tfhers_int
concrete.compiler.utils
concrete.compiler.value
concrete.compiler.value_decrypter
concrete.compiler.value_exporter
concrete.compiler.wrapper
concrete.fhe
concrete.fhe.compilation
concrete.fhe.compilation.artifacts
concrete.fhe.compilation.circuit
concrete.fhe.compilation.client
concrete.fhe.compilation.compiler
concrete.fhe.compilation.composition
concrete.fhe.compilation.configuration
concrete.fhe.compilation.decorators
concrete.fhe.compilation.keys
concrete.fhe.compilation.module
concrete.fhe.compilation.module_compiler
concrete.fhe.compilation.server
concrete.fhe.compilation.specs
concrete.fhe.compilation.status
concrete.fhe.compilation.utils
concrete.fhe.compilation.value
concrete.fhe.compilation.wiring
concrete.fhe.dtypes
concrete.fhe.dtypes.base
concrete.fhe.dtypes.float
concrete.fhe.dtypes.integer
concrete.fhe.dtypes.utils
concrete.fhe.extensions
concrete.fhe.extensions.array
concrete.fhe.extensions.bits
concrete.fhe.extensions.constant
concrete.fhe.extensions.convolution
concrete.fhe.extensions.hint
concrete.fhe.extensions.identity
concrete.fhe.extensions.maxpool
concrete.fhe.extensions.multivariate
concrete.fhe.extensions.ones
concrete.fhe.extensions.relu
concrete.fhe.extensions.round_bit_pattern
concrete.fhe.extensions.table
concrete.fhe.extensions.tag
concrete.fhe.extensions.truncate_bit_pattern
concrete.fhe.extensions.univariate
concrete.fhe.extensions.zeros
concrete.fhe.internal
concrete.fhe.internal.utils
concrete.fhe.mlir
concrete.fhe.mlir.context
concrete.fhe.mlir.conversion
concrete.fhe.mlir.converter
concrete.fhe.mlir.processors
concrete.fhe.mlir.processors.assign_bit_widths
concrete.fhe.mlir.processors.assign_node_ids
concrete.fhe.mlir.processors.check_integer_only
concrete.fhe.mlir.processors.process_rounding
concrete.fhe.mlir.utils
concrete.fhe.representation
concrete.fhe.representation.evaluator
concrete.fhe.representation.graph
concrete.fhe.representation.node
concrete.fhe.representation.operation
concrete.fhe.representation.utils
concrete.fhe.tfhers
concrete.fhe.tfhers.bridge
concrete.fhe.tfhers.dtypes
concrete.fhe.tfhers.tracing
concrete.fhe.tfhers.values
concrete.fhe.tracing
concrete.fhe.tracing.tracer
concrete.fhe.tracing.typing
concrete.fhe.values
concrete.fhe.values.scalar
concrete.fhe.values.tensor
concrete.fhe.values.value_description
concrete.fhe.version
concrete.lang
concrete.lang.dialects
concrete.lang.dialects.fhe
concrete.lang.dialects.fhelinalg
concrete.lang.dialects.tracing
client_parameters.ClientParameters
client_support.ClientSupport
compilation_context.CompilationContext
compilation_feedback.CircuitCompilationFeedback
compilation_feedback.ProgramCompilationFeedback
compilation_options.CompilationOptions
evaluation_keys.EvaluationKeys
key_set.KeySet
key_set_cache.KeySetCache
lambda_argument.LambdaArgument
library_compilation_result.LibraryCompilationResult
library_lambda.LibraryLambda
library_support.LibrarySupport
lwe_secret_key.LweSecretKey
lwe_secret_key.LweSecretKeyParam
parameter.Parameter
public_arguments.PublicArguments
public_result.PublicResult
server_circuit.ServerCircuit
server_program.ServerProgram
simulated_value_decrypter.SimulatedValueDecrypter
simulated_value_exporter.SimulatedValueExporter
tfhers_int.TfhersExporter
tfhers_int.TfhersFheIntDescription
value.Value
value_decrypter.ValueDecrypter
value_exporter.ValueExporter
wrapper.WrapperCpp
artifacts.DebugArtifacts
artifacts.DebugManager
artifacts.FunctionDebugArtifacts
artifacts.ModuleDebugArtifacts
circuit.Circuit
client.Client
compiler.Compiler
composition.CompositionClause
composition.CompositionPolicy
composition.CompositionRule
configuration.ApproximateRoundingConfig
decorators.Compilable
keys.Keys
module.ExecutionRt
module.FheFunction
module.FheModule
module.SimulationRt
module_compiler.FunctionDef
module_compiler.ModuleCompiler
server.Server
specs.ClientSpecs
status.EncryptionStatus
utils.Lazy
value.Value
wiring.AllComposable
wiring.AllInputs
wiring.AllOutputs
wiring.Input
wiring.NotComposable
wiring.Output
wiring.TracedOutput
wiring.Wire
wiring.WireInput
wiring.WireOutput
wiring.WireTracingContextManager
wiring.Wired
base.BaseDataType
float.Float
integer.Integer
bits.Bits
round_bit_pattern.Adjusting
round_bit_pattern.AutoRounder
table.LookupTable
truncate_bit_pattern.Adjusting
truncate_bit_pattern.AutoTruncator
context.Context
conversion.Conversion
conversion.ConversionType
converter.Converter
assign_bit_widths.AdditionalConstraints
assign_bit_widths.AssignBitWidths
assign_node_ids.AssignNodeIds
check_integer_only.CheckIntegerOnly
process_rounding.ProcessRounding
utils.Comparison
utils.HashableNdarray
evaluator.ConstantEvaluator
evaluator.GenericEvaluator
evaluator.GenericTupleEvaluator
evaluator.InputEvaluator
graph.Graph
graph.GraphProcessor
graph.MultiGraphProcessor
node.Node
operation.Operation
bridge.Bridge
dtypes.CryptoParams
dtypes.EncryptionKeyChoice
dtypes.TFHERSIntegerType
values.TFHERSInteger
tracer.Annotation
tracer.ScalarAnnotation
tracer.TensorAnnotation
tracer.Tracer
typing.f32
typing.f64
typing.int1
typing.int10
typing.int11
typing.int12
typing.int13
typing.int14
typing.int15
typing.int16
typing.int17
typing.int18
typing.int19
typing.int2
typing.int20
typing.int21
typing.int22
typing.int23
typing.int24
typing.int25
typing.int26
typing.int27
typing.int28
typing.int29
typing.int3
typing.int30
typing.int31
typing.int32
typing.int33
typing.int34
typing.int35
typing.int36
typing.int37
typing.int38
typing.int39
typing.int4
typing.int40
typing.int41
typing.int42
typing.int43
typing.int44
typing.int45
typing.int46
typing.int47
typing.int48
typing.int49
typing.int5
typing.int50
typing.int51
typing.int52
typing.int53
typing.int54
typing.int55
typing.int56
typing.int57
typing.int58
typing.int59
typing.int6
typing.int60
typing.int61
typing.int62
typing.int63
typing.int64
typing.int7
typing.int8
typing.int9
typing.tensor
typing.uint1
typing.uint10
typing.uint11
typing.uint12
typing.uint13
typing.uint14
typing.uint15
typing.uint16
typing.uint17
typing.uint18
typing.uint19
typing.uint2
typing.uint20
typing.uint21
typing.uint22
typing.uint23
typing.uint24
typing.uint25
typing.uint26
typing.uint27
typing.uint28
typing.uint29
typing.uint3
typing.uint30
typing.uint31
typing.uint32
typing.uint33
typing.uint34
typing.uint35
typing.uint36
typing.uint37
typing.uint38
typing.uint39
typing.uint4
typing.uint40
typing.uint41
typing.uint42
typing.uint43
typing.uint44
typing.uint45
typing.uint46
typing.uint47
typing.uint48
typing.uint49
typing.uint5
typing.uint50
typing.uint51
typing.uint52
typing.uint53
typing.uint54
typing.uint55
typing.uint56
typing.uint57
typing.uint58
typing.uint59
typing.uint6
typing.uint60
typing.uint61
typing.uint62
typing.uint63
typing.uint64
typing.uint7
typing.uint8
typing.uint9
value_description.ValueDescription
compiler.check_gpu_available
compiler.check_gpu_enabled
compiler.init_dfr
compiler.round_trip
compilation_feedback.tag_from_location
utils.lookup_runtime_lib
decorators.circuit
decorators.compiler
decorators.function
decorators.module
utils.add_nodes_from_to
utils.check_subgraph_fusibility
utils.convert_subgraph_to_subgraph_node
utils.find_closest_integer_output_nodes
utils.find_float_subgraph_with_unique_terminal_node
utils.find_single_lca
utils.find_tlu_subgraph_with_multiple_variable_inputs_that_has_a_single_common_ancestor
utils.friendly_type_format
utils.fuse
utils.get_terminal_size
utils.inputset
utils.is_single_common_ancestor
utils.validate_input_args
utils.combine_dtypes
array.array
bits.bits
constant.constant
convolution.conv
hint.hint
identity.identity
identity.refresh
maxpool.maxpool
multivariate.multivariate
ones.one
ones.ones
ones.ones_like
relu.relu
round_bit_pattern.round_bit_pattern
tag.tag
truncate_bit_pattern.truncate_bit_pattern
univariate.univariate
zeros.zero
zeros.zeros
zeros.zeros_like
utils.assert_that
utils.unreachable
utils.construct_deduplicated_tables
utils.construct_table
utils.construct_table_multivariate
utils.flood_replace_none_values
utils.format_constant
utils.format_indexing_element
bridge.new_bridge
tracing.from_native
tracing.to_native
scalar.clear_scalar_builder
scalar.encrypted_scalar_builder
scalar.clear_scalar_builder
scalar.encrypted_scalar_builder
tensor.clear_tensor_builder
tensor.encrypted_tensor_builder
tensor.clear_tensor_builder
tensor.encrypted_tensor_builder
our tutorial
__abs__
__add__
__and__
__eq__
__floordiv__
__ge__
__getitem__
__gt__
__invert__
__le__
__lshift__
__lt__
__matmul__
__mod__
__mul__
__ne__
__neg__
__or__
__pos__
__pow__
__radd__
__rand__
__rfloordiv__
__rlshift__
__rmatmul__
__rmod__
__rmul__
__ror__
__round__
__rpow__
__rrshift__
__rshift__
__rsub__
__rtruediv__
__rxor__
__sub__
__truediv__
__xor__
np.absolute
np.add
np.arccos
np.arccosh
np.arcsin
np.arcsinh
np.arctan
np.arctan2
np.arctanh
np.around
np.bitwise_and
np.bitwise_or
np.bitwise_xor
np.broadcast_to
np.cbrt
np.ceil
np.clip
np.concatenate
np.copysign
np.cos
np.cosh
np.deg2rad
np.degrees
np.dot
np.equal
np.exp
np.exp2
np.expand_dims
np.expm1
np.fabs
np.float_power
np.floor
np.floor_divide
np.fmax
np.fmin
np.fmod
np.gcd
np.greater
np.greater_equal
np.heaviside
np.hypot
np.invert
np.isfinite
np.isinf
np.isnan
np.lcm
np.ldexp
np.left_shift
np.less
np.less_equal
np.log
np.log10
np.log1p
np.log2
np.logaddexp
np.logaddexp2
np.logical_and
np.logical_not
np.logical_or
np.logical_xor
np.matmul
np.max
np.maximum
np.min
np.minimum
np.multiply
np.negative
np.nextafter
np.not_equal
np.ones_like
np.positive
np.power
np.rad2deg
np.radians
np.reciprocal
np.remainder
np.reshape
np.right_shift
np.rint
np.round
np.sign
np.signbit
np.sin
np.sinh
np.spacing
np.sqrt
np.square
np.subtract
np.sum
np.tan
np.tanh
np.transpose
np.true_divide
np.trunc
np.where
np.zeros_like
np.ndarray.astype
np.ndarray.clip
np.ndarray.dot
np.ndarray.flatten
np.ndarray.reshape
np.ndarray.transpose
np.ndarray.shape
np.ndarray.ndim
np.ndarray.size
np.ndarray.T
here
pygraphviz
graphviz
install
here
univariate
$x$
$E[x]$
$E[x] + E[y] = E[x + y]$
$E[x] * E[y] = E[x * y]$
$\frac{1}{100000}$
$f(E[x]) = E[f(x)]$
import numpy as np

def encrypted_max(x: uint4):
    return np.maximum(5, x)
def encrypted_max(x: uint4):
    lut = [5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
    return lut[x]
module {
  func.func @main(%arg0: !FHE.eint<4>) -> !FHE.eint<4> {
    %cst = arith.constant dense<[5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]> : tensor<16xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    return %0 : !FHE.eint<4>
  }
}
FHE Overview
constraints
import numpy as np
from concrete import fhe


def g(z):
    with fhe.tag("def"):
        a = 120 - z
        b = a // 4
    return b


def f(x):
    with fhe.tag("abc"):
        x = x * 2
        with fhe.tag("foo"):
            y = x + 42
        z = np.sqrt(y).astype(np.int64)

    return g(z + 3) * 2
 %0 = x                            # EncryptedScalar<uint4>        ∈ [0, 9]
 %1 = 2                            # ClearScalar<uint2>            ∈ [2, 2]            @ abc
 %2 = multiply(%0, %1)             # EncryptedScalar<uint5>        ∈ [0, 18]           @ abc
 %3 = 42                           # ClearScalar<uint6>            ∈ [42, 42]          @ abc.foo
 %4 = add(%2, %3)                  # EncryptedScalar<uint6>        ∈ [42, 60]          @ abc.foo
 %5 = subgraph(%4)                 # EncryptedScalar<uint3>        ∈ [6, 7]            @ abc
 %6 = 3                            # ClearScalar<uint2>            ∈ [3, 3]
 %7 = add(%5, %6)                  # EncryptedScalar<uint4>        ∈ [9, 10]
 %8 = 120                          # ClearScalar<uint7>            ∈ [120, 120]        @ def
 %9 = subtract(%8, %7)             # EncryptedScalar<uint7>        ∈ [110, 111]        @ def
%10 = 4                            # ClearScalar<uint3>            ∈ [4, 4]            @ def
%11 = floor_divide(%9, %10)        # EncryptedScalar<uint5>        ∈ [27, 27]          @ def
%12 = 2                            # ClearScalar<uint2>            ∈ [2, 2]
%13 = multiply(%11, %12)           # EncryptedScalar<uint6>        ∈ [54, 54]
return %13

Subgraphs:

    %5 = subgraph(%4):

        %0 = input                         # EncryptedScalar<uint2>          @ abc.foo
        %1 = sqrt(%0)                      # EncryptedScalar<float64>        @ abc
        %2 = astype(%1, dtype=int_)        # EncryptedScalar<uint1>          @ abc
        return %2
-DCONCRETELANG_CUDA_SUPPORT=${CUDA_SUPPORT}
-DCUDAToolkit_ROOT=$(CUDA_PATH)
func.func @main(%arg0: tensor<4x3x!FHE.eint<2>>, %arg1: tensor<3x2xi3>) -> tensor<4x2x!FHE.eint<2>> {
  %0 = "FHELinalg.matmul_eint_int"(%arg0, %arg1) : (tensor<4x3x!FHE.eint<2>>, tensor<3x2xi3>) -> tensor<4x2x!FHE.eint<2>>
  return %0 : tensor<4x2x!FHE.eint<2>>
}
#map = affine_map<(d0, d1, d2) -> (d0, d2)>
#map1 = affine_map<(d0, d1, d2) -> (d2, d1)>
#map2 = affine_map<(d0, d1, d2) -> (d0, d1)>

func.func @main(%arg0: tensor<4x3x!FHE.eint<2>>, %arg1: tensor<3x2xi3>) -> tensor<4x2x!FHE.eint<2>> {
  %0 = "FHE.zero_tensor"() : () -> tensor<4x2x!FHE.eint<2>>
  %1 = linalg.generic {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "reduction"]} ins(%arg0, %arg1 : tensor<4x3x!FHE.eint<2>>, tensor<3x2xi3>) outs(%0 : tensor<4x2x!FHE.eint<2>>) {
  ^bb0(%in: !FHE.eint<2>, %in_0: i3, %out: !FHE.eint<2>):
    %2 = "FHE.mul_eint_int"(%in, %in_0) : (!FHE.eint<2>, i3) -> !FHE.eint<2>
    %3 = "FHE.add_eint"(%out, %2) : (!FHE.eint<2>, !FHE.eint<2>) -> !FHE.eint<2>
    linalg.yield %3 : !FHE.eint<2>
  } -> tensor<4x2x!FHE.eint<2>>
  return %1 : tensor<4x2x!FHE.eint<2>>
}
func.func @main(%arg0: tensor<4x3x!FHE.eint<2>>, %arg1: tensor<3x2xi3>) -> tensor<4x2x!FHE.eint<2>> {
  %c0 = arith.constant 0 : index
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c2 = arith.constant 2 : index
  %c3 = arith.constant 3 : index
  %0 = "FHE.zero_tensor"() : () -> tensor<4x2x!FHE.eint<2>>
  %1 = scf.for %arg2 = %c0 to %c4 step %c1 iter_args(%arg3 = %0) -> (tensor<4x2x!FHE.eint<2>>) {
    %2 = scf.for %arg4 = %c0 to %c2 step %c1 iter_args(%arg5 = %arg3) -> (tensor<4x2x!FHE.eint<2>>) {
      %3 = scf.for %arg6 = %c0 to %c3 step %c1 iter_args(%arg7 = %arg5) -> (tensor<4x2x!FHE.eint<2>>) {
        %extracted = tensor.extract %arg0[%arg2, %arg6] : tensor<4x3x!FHE.eint<2>>
        %extracted_0 = tensor.extract %arg1[%arg6, %arg4] : tensor<3x2xi3>
        %extracted_1 = tensor.extract %arg7[%arg2, %arg4] : tensor<4x2x!FHE.eint<2>>
        %4 = "FHE.mul_eint_int"(%extracted, %extracted_0) : (!FHE.eint<2>, i3) -> !FHE.eint<2>
        %5 = "FHE.add_eint"(%extracted_1, %4) : (!FHE.eint<2>, !FHE.eint<2>) -> !FHE.eint<2>
        %inserted = tensor.insert %5 into %arg7[%arg2, %arg4] : tensor<4x2x!FHE.eint<2>>
        scf.yield %inserted : tensor<4x2x!FHE.eint<2>>
      }
      scf.yield %3 : tensor<4x2x!FHE.eint<2>>
    }
    scf.yield %2 : tensor<4x2x!FHE.eint<2>>
  }
  return %1 : tensor<4x2x!FHE.eint<2>>
}
func.func @main(%arg0: tensor<4x3x!TFHE.glwe<sk?>>, %arg1: tensor<3x2xi3>) -> tensor<4x2x!TFHE.glwe<sk?>> {
  %c0 = arith.constant 0 : index
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c2 = arith.constant 2 : index
  %c3 = arith.constant 3 : index
  %0 = "TFHE.zero_tensor"() : () -> tensor<4x2x!TFHE.glwe<sk?>>
  %1 = scf.for %arg2 = %c0 to %c4 step %c1 iter_args(%arg3 = %0) -> (tensor<4x2x!TFHE.glwe<sk?>>) {
    %2 = scf.for %arg4 = %c0 to %c2 step %c1 iter_args(%arg5 = %arg3) -> (tensor<4x2x!TFHE.glwe<sk?>>) {
      %3 = scf.for %arg6 = %c0 to %c3 step %c1 iter_args(%arg7 = %arg5) -> (tensor<4x2x!TFHE.glwe<sk?>>) {
        %extracted = tensor.extract %arg0[%arg2, %arg6] : tensor<4x3x!TFHE.glwe<sk?>>
        %extracted_0 = tensor.extract %arg1[%arg6, %arg4] : tensor<3x2xi3>
        %extracted_1 = tensor.extract %arg7[%arg2, %arg4] : tensor<4x2x!TFHE.glwe<sk?>>
        %4 = arith.extsi %extracted_0 : i3 to i64
        %5 = "TFHE.mul_glwe_int"(%extracted, %4) : (!TFHE.glwe<sk?>, i64) -> !TFHE.glwe<sk?>
        %6 = "TFHE.add_glwe"(%extracted_1, %5) : (!TFHE.glwe<sk?>, !TFHE.glwe<sk?>) -> !TFHE.glwe<sk?>
        %inserted = tensor.insert %6 into %arg7[%arg2, %arg4] : tensor<4x2x!TFHE.glwe<sk?>>
        scf.yield %inserted : tensor<4x2x!TFHE.glwe<sk?>>
      }
      scf.yield %3 : tensor<4x2x!TFHE.glwe<sk?>>
    }
    scf.yield %2 : tensor<4x2x!TFHE.glwe<sk?>>
  }
  return %1 : tensor<4x2x!TFHE.glwe<sk?>>
}
func.func @main(%arg0: tensor<4x3x!TFHE.glwe<sk<0,1,512>>>, %arg1: tensor<3x2xi3>) -> tensor<4x2x!TFHE.glwe<sk<0,1,512>>> {
  %c0 = arith.constant 0 : index
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c2 = arith.constant 2 : index
  %c3 = arith.constant 3 : index
  %0 = "TFHE.zero_tensor"() : () -> tensor<4x2x!TFHE.glwe<sk<0,1,512>>>
  %1 = scf.for %arg2 = %c0 to %c4 step %c1 iter_args(%arg3 = %0) -> (tensor<4x2x!TFHE.glwe<sk<0,1,512>>>) {
    %2 = scf.for %arg4 = %c0 to %c2 step %c1 iter_args(%arg5 = %arg3) -> (tensor<4x2x!TFHE.glwe<sk<0,1,512>>>) {
      %3 = scf.for %arg6 = %c0 to %c3 step %c1 iter_args(%arg7 = %arg5) -> (tensor<4x2x!TFHE.glwe<sk<0,1,512>>>) {
        %extracted = tensor.extract %arg0[%arg2, %arg6] : tensor<4x3x!TFHE.glwe<sk<0,1,512>>>
        %extracted_0 = tensor.extract %arg1[%arg6, %arg4] : tensor<3x2xi3>
        %extracted_1 = tensor.extract %arg7[%arg2, %arg4] : tensor<4x2x!TFHE.glwe<sk<0,1,512>>>
        %4 = arith.extsi %extracted_0 : i3 to i64
        %5 = "TFHE.mul_glwe_int"(%extracted, %4) : (!TFHE.glwe<sk<0,1,512>>, i64) -> !TFHE.glwe<sk<0,1,512>>
        %6 = "TFHE.add_glwe"(%extracted_1, %5) : (!TFHE.glwe<sk<0,1,512>>, !TFHE.glwe<sk<0,1,512>>) -> !TFHE.glwe<sk<0,1,512>>
        %inserted = tensor.insert %6 into %arg7[%arg2, %arg4] : tensor<4x2x!TFHE.glwe<sk<0,1,512>>>
        scf.yield %inserted : tensor<4x2x!TFHE.glwe<sk<0,1,512>>>
      }
      scf.yield %3 : tensor<4x2x!TFHE.glwe<sk<0,1,512>>>
    }
    scf.yield %2 : tensor<4x2x!TFHE.glwe<sk<0,1,512>>>
  }
  return %1 : tensor<4x2x!TFHE.glwe<sk<0,1,512>>>
}
func.func @main(%arg0: tensor<4x3x513xi64>, %arg1: tensor<3x2xi3>) -> tensor<4x2x513xi64> {
  %c0 = arith.constant 0 : index
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c2 = arith.constant 2 : index
  %c3 = arith.constant 3 : index
  %generated = tensor.generate  {
  ^bb0(%arg2: index, %arg3: index, %arg4: index):
    %c0_i64 = arith.constant 0 : i64
    tensor.yield %c0_i64 : i64
  } : tensor<4x2x513xi64>
  %0 = scf.for %arg2 = %c0 to %c4 step %c1 iter_args(%arg3 = %generated) -> (tensor<4x2x513xi64>) {
    %1 = scf.for %arg4 = %c0 to %c2 step %c1 iter_args(%arg5 = %arg3) -> (tensor<4x2x513xi64>) {
      %2 = scf.for %arg6 = %c0 to %c3 step %c1 iter_args(%arg7 = %arg5) -> (tensor<4x2x513xi64>) {
        %extracted_slice = tensor.extract_slice %arg0[%arg2, %arg6, 0] [1, 1, 513] [1, 1, 1] : tensor<4x3x513xi64> to tensor<513xi64>
        %extracted = tensor.extract %arg1[%arg6, %arg4] : tensor<3x2xi3>
        %extracted_slice_0 = tensor.extract_slice %arg7[%arg2, %arg4, 0] [1, 1, 513] [1, 1, 1] : tensor<4x2x513xi64> to tensor<513xi64>
        %3 = arith.extsi %extracted : i3 to i64
        %4 = "Concrete.mul_cleartext_lwe_tensor"(%extracted_slice, %3) : (tensor<513xi64>, i64) -> tensor<513xi64>
        %5 = "Concrete.add_lwe_tensor"(%extracted_slice_0, %4) : (tensor<513xi64>, tensor<513xi64>) -> tensor<513xi64>
        %inserted_slice = tensor.insert_slice %5 into %arg7[%arg2, %arg4, 0] [1, 1, 513] [1, 1, 1] : tensor<513xi64> into tensor<4x2x513xi64>
        scf.yield %inserted_slice : tensor<4x2x513xi64>
      }
      scf.yield %2 : tensor<4x2x513xi64>
    }
    scf.yield %1 : tensor<4x2x513xi64>
  }
  return %0 : tensor<4x2x513xi64>
}
func.func @main(%arg0: memref<4x3x513xi64, strided<[?, ?, ?], offset: ?>>, %arg1: memref<3x2xi3, strided<[?, ?], offset: ?>>, %arg2: !Concrete.context) -> memref<4x2x513xi64> {
  %c0_i64 = arith.constant 0 : i64
  call @_dfr_start(%c0_i64, %arg2) : (i64, !Concrete.context) -> ()
  %c0 = arith.constant 0 : index
  %c4 = arith.constant 4 : index
  %c1 = arith.constant 1 : index
  %c2 = arith.constant 2 : index
  %c513 = arith.constant 513 : index
  %c0_i64_0 = arith.constant 0 : i64
  %c3 = arith.constant 3 : index
  %alloc = memref.alloc() {alignment = 64 : i64} : memref<4x2x513xi64>
  scf.for %arg3 = %c0 to %c4 step %c1 {
    scf.for %arg4 = %c0 to %c2 step %c1 {
      scf.for %arg5 = %c0 to %c513 step %c1 {
        memref.store %c0_i64_0, %alloc[%arg3, %arg4, %arg5] : memref<4x2x513xi64>
      }
    }
  }
  scf.for %arg3 = %c0 to %c4 step %c1 {
    scf.for %arg4 = %c0 to %c2 step %c1 {
      %subview = memref.subview %alloc[%arg3, %arg4, 0] [1, 1, 513] [1, 1, 1] : memref<4x2x513xi64> to memref<513xi64, strided<[1], offset: ?>>
      scf.for %arg5 = %c0 to %c3 step %c1 {
        %subview_1 = memref.subview %arg0[%arg3, %arg5, 0] [1, 1, 513] [1, 1, 1] : memref<4x3x513xi64, strided<[?, ?, ?], offset: ?>> to memref<513xi64, strided<[?], offset: ?>>
        %0 = memref.load %arg1[%arg5, %arg4] : memref<3x2xi3, strided<[?, ?], offset: ?>>
        %1 = arith.extsi %0 : i3 to i64
        %alloc_2 = memref.alloc() {alignment = 64 : i64} : memref<513xi64>
        %cast = memref.cast %alloc_2 : memref<513xi64> to memref<?xi64, #map>
        %cast_3 = memref.cast %subview_1 : memref<513xi64, strided<[?], offset: ?>> to memref<?xi64, #map>
        func.call @memref_mul_cleartext_lwe_ciphertext_u64(%cast, %cast_3, %1) : (memref<?xi64, #map>, memref<?xi64, #map>, i64) -> ()
        %alloc_4 = memref.alloc() {alignment = 64 : i64} : memref<513xi64>
        %cast_5 = memref.cast %alloc_4 : memref<513xi64> to memref<?xi64, #map>
        %cast_6 = memref.cast %subview : memref<513xi64, strided<[1], offset: ?>> to memref<?xi64, #map>
        %cast_7 = memref.cast %alloc_2 : memref<513xi64> to memref<?xi64, #map>
        func.call @memref_add_lwe_ciphertexts_u64(%cast_5, %cast_6, %cast_7) : (memref<?xi64, #map>, memref<?xi64, #map>, memref<?xi64, #map>) -> ()
        memref.dealloc %alloc_2 : memref<513xi64>
        memref.copy %alloc_4, %subview : memref<513xi64> to memref<513xi64, strided<[1], offset: ?>>
        memref.dealloc %alloc_4 : memref<513xi64>
      }
    }
  }
  call @_dfr_stop(%c0_i64) : (i64) -> ()
  return %alloc : memref<4x2x513xi64>
}
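Read together, the listings above lower one computation: a 4x3 tensor of ciphertexts is multiplied by a 3x2 tensor of clear `i3` weights and accumulated into a zero-initialized 4x2 result. The sketch below mirrors that loop nest on clear integers only. No encryption is involved, and the names `matmul_loop_nest`, `x` and `w` are illustrative rather than taken from the compiler.

```python
def matmul_loop_nest(x, w):
    """Clear-integer mirror of the lowered loop nest (illustrative only)."""
    rows, inner, cols = len(x), len(w), len(w[0])
    # Corresponds to the zero-initialized result tensor.
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):            # loop over the 4 rows of the input
        for j in range(cols):        # loop over the 2 output columns
            for k in range(inner):   # loop over the 3 shared entries
                # Plays the role of the ciphertext-by-cleartext multiplication.
                product = x[i][k] * w[k][j]
                # Plays the role of the ciphertext addition into the output cell.
                out[i][j] = out[i][j] + product
    return out


# Stand-in for the 4x3 encrypted input (plain integers here).
x = [[1, 2, 3], [4, 5, 6], [7, 0, 1], [2, 2, 2]]
# 3x2 clear weights; every value fits in a signed 3-bit integer.
w = [[1, -1], [0, 2], [3, 1]]

result = matmul_loop_nest(x, w)
print(result)
```

Each MLIR stage keeps this loop structure and only changes how a ciphertext is represented.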
%0 = x                  # EncryptedScalar<uint3>              ∈ [0, 7]
%1 = y                  # EncryptedScalar<uint4>              ∈ [0, 15]
%2 = add(%0, %1)        # EncryptedScalar<uint5>              ∈ [2, 22]
return %2                                     ^ these are       ^^^^^^^
                                                the assigned    based on
                                                bit-widths      these bounds
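The bit-widths in the graph above follow from the recorded bounds. As a rough sketch (the helper name `required_bit_width` is hypothetical, and Concrete's own rule may differ in corner cases), the smallest integer width that covers a `[lower, upper]` range can be computed like this:

```python
def required_bit_width(lower, upper):
    """Smallest integer bit-width covering [lower, upper] (illustrative only)."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower >= 0:
        # Unsigned case: enough bits for the largest value, at least 1.
        return max(1, upper.bit_length())
    # Signed case (two's complement): magnitude bits plus one sign bit.
    magnitude_bits = max((-lower - 1).bit_length(), max(upper, 0).bit_length())
    return magnitude_bits + 1


print(required_bit_width(0, 7))    # x in the graph above
print(required_bit_width(0, 15))   # y in the graph above
print(required_bit_width(2, 22))   # add(x, y) in the graph above
```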
D: data
N: noise

3-bit number
------------
D2 D1 D0 0 0 0 ... 0 0 0 N N N N

4-bit number
------------
D3 D2 D1 D0 0 0 0 ... 0 0 0 N N N N
5-bit number
------------
D4 D3 D2 D1 D0 0 0 0 ... 0 0 0 N N N N
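The diagrams above place the data bits at the most significant end of the word and the noise at the least significant end. The sketch below imitates that layout on a plain 64-bit integer to show why operands of an addition need the same bit-width. It is a simplification: the 64-bit word size, the absence of padding bits, and the helper names `encode` and `decode` are all assumptions made for illustration.

```python
WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1


def encode(message, bit_width):
    """Place a bit_width-bit message in the most significant bits of the word."""
    return (message << (WORD_BITS - bit_width)) & MASK


def decode(word, bit_width):
    """Round away low-order noise, then read the top bit_width bits."""
    shift = WORD_BITS - bit_width
    rounded = (word + (1 << (shift - 1))) & MASK
    return rounded >> shift


# Small noise in the low bits does not disturb the message.
noisy = (encode(5, 3) + 12345) & MASK
recovered = decode(noisy, 3)

# Same width on both operands: the data bits line up and 2 + 3 decodes to 5.
same_width = decode((encode(2, 5) + encode(3, 5)) & MASK, 5)

# Mixed widths (3-bit plus 4-bit): the data bits are misaligned.
mixed_width = decode((encode(2, 3) + encode(3, 4)) & MASK, 4)

print(recovered, same_width, mixed_width)
```

With mixed widths the sum decodes to 7 instead of 5, which is the misalignment that uniform bit-width assignment avoids.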
%0 = x                  # EncryptedScalar<uint5>
%1 = y                  # EncryptedScalar<uint5>
%2 = add(%0, %1)        # EncryptedScalar<uint5>
return %2
%0 = x                    # EncryptedScalar<uint2>        ∈ [0, 3]
%1 = y                    # EncryptedScalar<uint5>        ∈ [0, 31]
%2 = 2                    # ClearScalar<uint2>            ∈ [2, 2]
%3 = power(%0, %2)        # EncryptedScalar<uint4>        ∈ [0, 9]
%4 = add(%3, %1)          # EncryptedScalar<uint6>        ∈ [1, 39]
return %4
%0 = x                    # EncryptedScalar<uint2>        ∈ [0, 3]
%1 = y                    # EncryptedScalar<uint6>        ∈ [0, 31]
%2 = 2                    # ClearScalar<uint2>            ∈ [2, 2]
%3 = power(%0, %2)        # EncryptedScalar<uint6>        ∈ [0, 9]
%4 = add(%3, %1)          # EncryptedScalar<uint6>        ∈ [1, 39]
return %4
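Comparing the two listings above, `y` and the result of `power` are widened to `uint6` so that both operands and the result of `add` share one bit-width, while `x` stays at `uint2`. Here is a minimal sketch of that adjustment, assuming (as the listings suggest) that an addition needs a single shared width and that a table lookup such as `power` may change width between input and output. The dictionary layout and names are illustrative.

```python
# Bit-widths derived from the bounds, before adjustment (first listing).
widths = {"x": 2, "y": 5, "power": 4, "add": 6}

# Nodes tied together by the addition: its two operands and its result.
addition_group = ["power", "y", "add"]

# Every node in the group is promoted to the widest member.
shared_width = max(widths[name] for name in addition_group)
for name in addition_group:
    widths[name] = shared_width

print(widths)
```

The input `x` is untouched because it only feeds the table lookup, whose output width is chosen independently of its input width.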
import time

import matplotlib.pyplot as plt
import numpy as np
import randimage
from concrete import fhe

configuration = fhe.Configuration(
    enable_unsafe_features=True,
    use_insecure_key_cache=True,
    insecure_key_cache_location=".keys",

    # To enable displaying progressbar
    show_progress=True,
    # To enable showing tags in the progressbar (does not work in notebooks)
    progress_tag=True,
    # To give a title to the progressbar
    progress_title="Evaluation:",
)

@fhe.compiler({"image": "encrypted"})
def to_grayscale(image):
    with fhe.tag("scaling.r"):
        r = image[:, :, 0]
        r = (r * 0.30).astype(np.int64)

    with fhe.tag("scaling.g"):
        g = image[:, :, 1]
        g = (g * 0.59).astype(np.int64)

    with fhe.tag("scaling.b"):
        b = image[:, :, 2]
        b = (b * 0.11).astype(np.int64)

    with fhe.tag("combining.rgb"):
        gray = r + g + b
        
    with fhe.tag("creating.result"):
        gray = np.expand_dims(gray, axis=2)
        result = np.concatenate((gray, gray, gray), axis=2)
    
    return result

image_size = (16, 16)
image_data = (randimage.get_random_image(image_size) * 255).round().astype(np.int64)

print()

print(f"Compilation started @ {time.strftime('%H:%M:%S', time.localtime())}")
start = time.time()
inputset = [np.random.randint(0, 256, size=image_data.shape) for _ in range(100)]
circuit = to_grayscale.compile(inputset, configuration)
end = time.time()
print(f"(took {end - start:.3f} seconds)")

print()

print(f"Key generation started @ {time.strftime('%H:%M:%S', time.localtime())}")
start = time.time()
circuit.keygen()
end = time.time()
print(f"(took {end - start:.3f} seconds)")

print()

print(f"Evaluation started @ {time.strftime('%H:%M:%S', time.localtime())}")
start = time.time()
grayscale_image_data = circuit.encrypt_run_decrypt(image_data)
end = time.time()
print(f"(took {end - start:.3f} seconds)")

fig, axs = plt.subplots(1, 2)
axs = axs.flatten()

axs[0].set_title("Original")
axs[0].imshow(image_data)
axs[0].axis("off")

axs[1].set_title("Grayscale")
axs[1].imshow(grayscale_image_data)
axs[1].axis("off")

plt.show()
Evaluation:  10% |█████.............................................|  10% (scaling.r)
^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^
Title        Progressbar                                                   Tag
Evaluation:  30% |███████████████...................................|  30% (scaling.g)
Evaluation:  50% |█████████████████████████.........................|  50% (scaling.b)
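To make the layout of these lines concrete, the following sketch composes a line from a title, a completion fraction, and a tag. It is not Concrete's implementation; the function name `render_progress` and the 50-character bar width are assumptions chosen to match the samples above.

```python
def render_progress(title, fraction, tag, bar_width=50):
    """Compose one progress line: title, percentage, bar, percentage, tag."""
    percent = round(fraction * 100)
    filled = round(fraction * bar_width)
    bar = "█" * filled + "." * (bar_width - filled)
    return f"{title} {percent:>3}% |{bar}| {percent:>3}% ({tag})"


line = render_progress("Evaluation:", 0.10, "scaling.r")
print(line)
print(render_progress("Evaluation:", 0.30, "scaling.g"))
```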
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (x**2) + (2*x) + 4

inputset = range(2**2)
circuit = f.compile(inputset, show_statistics=True)
Statistics
--------------------------------------------------------------------------------
size_of_secret_keys: 22648
size_of_bootstrap_keys: 51274176
size_of_keyswitch_keys: 64092720
size_of_inputs: 16392
size_of_outputs: 16392
p_error: 9.627450598589458e-06
global_p_error: 9.627450598589458e-06
complexity: 99198195.0
programmable_bootstrap_count: 1
programmable_bootstrap_count_per_parameter: {
    BootstrapKeyParam(polynomial_size=2048, glwe_dimension=1, input_lwe_dimension=781, level=1, base_log=23, variance=9.940977002694397e-32): 1
}
key_switch_count: 1
key_switch_count_per_parameter: {
    KeyswitchKeyParam(level=5, base_log=3, variance=1.939836732335308e-11): 1
}
packing_key_switch_count: 0
clear_addition_count: 1
clear_addition_count_per_parameter: {
    LweSecretKeyParam(dimension=2048): 1
}
encrypted_addition_count: 1
encrypted_addition_count_per_parameter: {
    LweSecretKeyParam(dimension=2048): 1
}
clear_multiplication_count: 1
clear_multiplication_count_per_parameter: {
    LweSecretKeyParam(dimension=2048): 1
}
encrypted_negation_count: 0
--------------------------------------------------------------------------------
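A quick way to read these numbers is to total the key sizes and to relate `p_error` to `global_p_error`. The sketch below does both with values copied from the listing above. Two assumptions are made: the sizes are expressed in bytes, and table lookup errors are independent with the same probability, so that the global error is approximately `1 - (1 - p_error) ** n` for `n` lookups. The helper name `estimate_global_p_error` is hypothetical.

```python
# Values copied from the statistics listing above.
statistics = {
    "size_of_secret_keys": 22648,
    "size_of_bootstrap_keys": 51274176,
    "size_of_keyswitch_keys": 64092720,
    "p_error": 9.627450598589458e-06,
    "global_p_error": 9.627450598589458e-06,
    "programmable_bootstrap_count": 1,
}

# Total key material, assuming the sizes are in bytes.
total_key_bytes = sum(
    value
    for name, value in statistics.items()
    if name.startswith("size_of_") and name.endswith("_keys")
)
print(f"total key material: {total_key_bytes} bytes ({total_key_bytes / 2**20:.2f} MiB)")


def estimate_global_p_error(p_error, table_lookups):
    """Probability that at least one of the lookups fails (independence assumed)."""
    return 1 - (1 - p_error) ** table_lookups


estimate = estimate_global_p_error(
    statistics["p_error"], statistics["programmable_bootstrap_count"]
)
print(estimate)
```

With a single programmable bootstrap the estimate coincides with `p_error`, which agrees with the listing, where `p_error` and `global_p_error` are equal.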
Statistics
--------------------------------------------------------------------------------
clear_multiplication_count_per_tag: {
    /model/model: 53342
    /model/model.0/Gemm: 14720
    /model/model.0/Gemm.matmul: 14720
    /model/model.2/Gemm: 11730
    /model/model.2/Gemm.matmul: 11730
    /model/model.4/Gemm: 9078
    /model/model.4/Gemm.matmul: 9078
    /model/model.6/Gemm: 6764
    /model/model.6/Gemm.matmul: 6764
    /model/model.8/Gemm: 4788
    /model/model.8/Gemm.matmul: 4788
    /model/model.10/Gemm: 3150
    /model/model.10/Gemm.matmul: 3150
    /model/model.12/Gemm: 1850
    /model/model.12/Gemm.matmul: 1850
    /model/model.14/Gemm: 888
    /model/model.14/Gemm.matmul: 888
    /model/model.16/Gemm: 264
    /model/model.16/Gemm.matmul: 264
    /model/model.18/Gemm: 110
    /model/model.18/Gemm.matmul: 110
}
encrypted_addition_count_per_tag: {
    /model/model: 53342
    /model/model.0/Gemm: 14720
    /model/model.0/Gemm.matmul: 14720
    /model/model.2/Gemm: 11730
    /model/model.2/Gemm.matmul: 11730
    /model/model.4/Gemm: 9078
    /model/model.4/Gemm.matmul: 9078
    /model/model.6/Gemm: 6764
    /model/model.6/Gemm.matmul: 6764
    /model/model.8/Gemm: 4788
    /model/model.8/Gemm.matmul: 4788
    /model/model.10/Gemm: 3150
    /model/model.10/Gemm.matmul: 3150
    /model/model.12/Gemm: 1850
    /model/model.12/Gemm.matmul: 1850
    /model/model.14/Gemm: 888
    /model/model.14/Gemm.matmul: 888
    /model/model.16/Gemm: 264
    /model/model.16/Gemm.matmul: 264
    /model/model.18/Gemm: 110
    /model/model.18/Gemm.matmul: 110
}
--------------------------------------------------------------------------------
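In the per-tag listing above, the count reported for `/model/model` equals the sum of the counts of the `Gemm` tags nested under it. The sketch below checks that with values copied from the listing. It is an observation about this particular output, not a documented guarantee.

```python
# clear_multiplication_count_per_tag, copied from the listing above
# (the ".matmul" entries repeat their parent's count and are omitted).
per_tag = {
    "/model/model": 53342,
    "/model/model.0/Gemm": 14720,
    "/model/model.2/Gemm": 11730,
    "/model/model.4/Gemm": 9078,
    "/model/model.6/Gemm": 6764,
    "/model/model.8/Gemm": 4788,
    "/model/model.10/Gemm": 3150,
    "/model/model.12/Gemm": 1850,
    "/model/model.14/Gemm": 888,
    "/model/model.16/Gemm": 264,
    "/model/model.18/Gemm": 110,
}

# Sum the counts of every layer-level tag.
layer_total = sum(count for tag, count in per_tag.items() if tag.endswith("/Gemm"))

# Find the layer that contributes the most operations.
heaviest_tag = max(
    (tag for tag in per_tag if tag.endswith("/Gemm")), key=per_tag.get
)

print(layer_total, per_tag["/model/model"], heaviest_tag)
```

Sorting or maximizing over the per-tag counts, as done for `heaviest_tag`, is a simple way to find which part of a model dominates the cost.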
from concrete import fhe
import numpy as np

configuration = fhe.Configuration(p_error=0.01, dataflow_parallelize=True)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x + 42

inputset = range(10)
circuit = f.compile(inputset, configuration=configuration)
from concrete import fhe
import numpy as np

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x + 42

inputset = range(10)
circuit = f.compile(inputset, p_error=0.01, dataflow_parallelize=True)
from concrete import fhe
import numpy as np

configuration = fhe.Configuration(p_error=0.01)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x + 42

inputset = range(10)
circuit = f.compile(inputset, configuration=configuration, loop_parallelize=True)

Tracing dialect

Tracing dialect: a dialect to print program values at runtime.

Operation definition

Tracing.trace_ciphertext (::mlir::concretelang::Tracing::TraceCiphertextOp)

Prints a ciphertext.

Attributes:

Attribute
MLIR Type
Description

msg

::mlir::StringAttr

string attribute

nmsb

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

ciphertext

Tracing.trace_message (::mlir::concretelang::Tracing::TraceMessageOp)

Prints a message.

Attributes:

Attribute
MLIR Type
Description

msg

::mlir::StringAttr

string attribute

Tracing.trace_plaintext (::mlir::concretelang::Tracing::TracePlaintextOp)

Prints a plaintext.

Attributes:

Attribute
MLIR Type
Description

msg

::mlir::StringAttr

string attribute

nmsb

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

plaintext

integer

TFHE dialect

High Level Fully Homomorphic Encryption dialect: a dialect for representing high-level operations on fully homomorphic ciphertexts.

Operation definition

TFHE.batched_add_glwe_cst_int (::mlir::concretelang::TFHE::ABatchedAddGLWECstIntOp)

Batched version of AddGLWEIntOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertext

A GLWE ciphertext

plaintexts

1D tensor of integer values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_add_glwe_int_cst (::mlir::concretelang::TFHE::ABatchedAddGLWEIntCstOp)

Batched version of AddGLWEIntOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

plaintext

integer

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_add_glwe_int (::mlir::concretelang::TFHE::ABatchedAddGLWEIntOp)

Batched version of AddGLWEIntOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

plaintexts

1D tensor of integer values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values
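The three batched integer-addition variants above differ only in which operand is a tensor. The sketch below shows the three shapes using clear integers as stand-ins for GLWE ciphertexts. The function names are illustrative, and the element-wise pairing in the last variant is an assumption based on both operands being 1D tensors.

```python
def batched_add_cst_int(ciphertext, plaintexts):
    """One ciphertext, a tensor of plaintexts (shape of batched_add_glwe_cst_int)."""
    return [ciphertext + plaintext for plaintext in plaintexts]


def batched_add_int_cst(ciphertexts, plaintext):
    """A tensor of ciphertexts, one plaintext (shape of batched_add_glwe_int_cst)."""
    return [ciphertext + plaintext for ciphertext in ciphertexts]


def batched_add_int(ciphertexts, plaintexts):
    """Tensors on both sides, paired element by element (assumed pairing)."""
    if len(ciphertexts) != len(plaintexts):
        raise ValueError("both operands must have the same length")
    return [c + p for c, p in zip(ciphertexts, plaintexts)]


print(batched_add_cst_int(10, [1, 2, 3]))
print(batched_add_int_cst([10, 20, 30], 1))
print(batched_add_int([10, 20, 30], [1, 2, 3]))
```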

TFHE.batched_add_glwe (::mlir::concretelang::TFHE::ABatchedAddGLWEOp)

Batched version of AddGLWEOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertexts_a

1D tensor of A GLWE ciphertext values

ciphertexts_b

1D tensor of A GLWE ciphertext values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.add_glwe_int (::mlir::concretelang::TFHE::AddGLWEIntOp)

Returns the sum of a clear integer and a GLWE ciphertext

Traits: AlwaysSpeculatableImplTrait

Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

A GLWE ciphertext

b

integer

Results:

Result
Description

«unnamed»

A GLWE ciphertext

TFHE.add_glwe (::mlir::concretelang::TFHE::AddGLWEOp)

Returns the sum of two GLWE ciphertexts

Traits: AlwaysSpeculatableImplTrait

Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

A GLWE ciphertext

b

A GLWE ciphertext

Results:

Result
Description

«unnamed»

A GLWE ciphertext

TFHE.batched_bootstrap_glwe (::mlir::concretelang::TFHE::BatchedBootstrapGLWEOp)

Batched version of BootstrapGLWEOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

key

::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr

An attribute representing bootstrap key.

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_keyswitch_glwe (::mlir::concretelang::TFHE::BatchedKeySwitchGLWEOp)

Batched version of KeySwitchGLWEOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

key

::mlir::concretelang::TFHE::GLWEKeyswitchKeyAttr

An attribute representing keyswitch key.

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_mapped_bootstrap_glwe (::mlir::concretelang::TFHE::BatchedMappedBootstrapGLWEOp)

Batched version of BootstrapGLWEOp which also batches the lookup table

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

key

::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr

An attribute representing bootstrap key.

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

lookup_table

2D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_mul_glwe_cst_int (::mlir::concretelang::TFHE::BatchedMulGLWECstIntOp)

Batched version of MulGLWEIntOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertext

A GLWE ciphertext

cleartexts

1D tensor of integer values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_mul_glwe_int_cst (::mlir::concretelang::TFHE::BatchedMulGLWEIntCstOp)

Batched version of MulGLWEIntOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

cleartext

integer

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_mul_glwe_int (::mlir::concretelang::TFHE::BatchedMulGLWEIntOp)

Batched version of MulGLWEIntOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

cleartexts

1D tensor of integer values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.batched_neg_glwe (::mlir::concretelang::TFHE::BatchedNegGLWEOp)

Batched version of NegGLWEOp

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertexts

1D tensor of A GLWE ciphertext values

Results:

Result
Description

result

1D tensor of A GLWE ciphertext values

TFHE.bootstrap_glwe (::mlir::concretelang::TFHE::BootstrapGLWEOp)

Programmable bootstrapping of a GLWE ciphertext with a lookup table

Traits: AlwaysSpeculatableImplTrait

Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

key

::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr

An attribute representing bootstrap key.

Operands:

Operand
Description

ciphertext

A GLWE ciphertext

lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

A GLWE ciphertext

TFHE.encode_expand_lut_for_bootstrap (::mlir::concretelang::TFHE::EncodeExpandLutForBootstrapOp)

Encode and expand a lookup table so that it can be used for a bootstrap.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

outputBits

::mlir::IntegerAttr

32-bit signless integer attribute

isSigned

::mlir::BoolAttr

bool attribute

Operands:

Operand
Description

input_lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of 64-bit signless integer values

TFHE.encode_lut_for_crt_woppbs (::mlir::concretelang::TFHE::EncodeLutForCrtWopPBSOp)

Encode and expand a lookup table so that it can be used for a WoP-PBS.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

crtDecomposition

::mlir::ArrayAttr

64-bit integer array attribute

crtBits

::mlir::ArrayAttr

64-bit integer array attribute

modulusProduct

::mlir::IntegerAttr

32-bit signless integer attribute

isSigned

::mlir::BoolAttr

bool attribute

Operands:

Operand
Description

input_lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

TFHE.encode_plaintext_with_crt (::mlir::concretelang::TFHE::EncodePlaintextWithCrtOp)

Encodes a plaintext by decomposing it on a CRT basis.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

mods

::mlir::ArrayAttr

64-bit integer array attribute

modsProd

::mlir::IntegerAttr

64-bit signless integer attribute

Operands:

Operand
Description

input

64-bit signless integer

Results:

Result
Description

result

1D tensor of 64-bit signless integer values
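As background for the decomposition step, the sketch below splits an integer into its residues over a set of pairwise coprime moduli and rebuilds it with the Chinese Remainder Theorem. It covers only the arithmetic idea: how the operation then encodes the residues for encryption is not modeled, and the moduli `[2, 3, 5, 7]` and helper names are hypothetical examples.

```python
from math import prod


def crt_decompose(value, mods):
    """Residues of value modulo each (pairwise coprime) modulus."""
    return [value % m for m in mods]


def crt_reconstruct(residues, mods):
    """Rebuild the value modulo prod(mods) from its residues."""
    total = prod(mods)
    result = 0
    for residue, modulus in zip(residues, mods):
        partial = total // modulus
        # pow(partial, -1, modulus) is the modular inverse (Python 3.8+).
        result += residue * partial * pow(partial, -1, modulus)
    return result % total


mods = [2, 3, 5, 7]            # hypothetical moduli; their product is 210
residues = crt_decompose(17, mods)
print(residues)
print(crt_reconstruct(residues, mods))
```

Any value below the product of the moduli is recovered exactly, which corresponds to the role of the `modsProd` attribute as the size of the representable range.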

TFHE.keyswitch_glwe (::mlir::concretelang::TFHE::KeySwitchGLWEOp)

Changes the encryption parameters of a GLWE ciphertext by applying a keyswitch

Traits: AlwaysSpeculatableImplTrait

Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

key

::mlir::concretelang::TFHE::GLWEKeyswitchKeyAttr

An attribute representing keyswitch key.

Operands:

Operand
Description

ciphertext

A GLWE ciphertext

Results:

Result
Description

result

A GLWE ciphertext

TFHE.mul_glwe_int (::mlir::concretelang::TFHE::MulGLWEIntOp)

Returns the product of a clear integer and a GLWE ciphertext

Traits: AlwaysSpeculatableImplTrait

Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

A GLWE ciphertext

b

integer

Results:

Result
Description

«unnamed»

A GLWE ciphertext

TFHE.neg_glwe (::mlir::concretelang::TFHE::NegGLWEOp)

Negates a GLWE ciphertext

Traits: AlwaysSpeculatableImplTrait

Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

A GLWE ciphertext

Results:

Result
Description

«unnamed»

A GLWE ciphertext

TFHE.sub_int_glwe (::mlir::concretelang::TFHE::SubGLWEIntOp)

Subtracts a GLWE ciphertext from a clear integer

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

integer

b

A GLWE ciphertext

Results:

Result
Description

«unnamed»

A GLWE ciphertext

TFHE.wop_pbs_glwe (::mlir::concretelang::TFHE::WopPBSGLWEOp)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

ksk

::mlir::concretelang::TFHE::GLWEKeyswitchKeyAttr

An attribute representing keyswitch key.

bsk

::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr

An attribute representing bootstrap key.

pksk

::mlir::concretelang::TFHE::GLWEPackingKeyswitchKeyAttr

An attribute representing a packing keyswitch key (used by WoP-PBS).

crtDecomposition

::mlir::ArrayAttr

64-bit integer array attribute

cbsLevels

::mlir::IntegerAttr

32-bit signless integer attribute

cbsBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

ciphertexts

lookupTable

2D tensor of 64-bit signless integer values

Results:

Result
Description

result

TFHE.zero (::mlir::concretelang::TFHE::ZeroGLWEOp)

Returns a trivial encryption of 0

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results:

Result
Description

out

A GLWE ciphertext

TFHE.zero_tensor (::mlir::concretelang::TFHE::ZeroTensorGLWEOp)

Returns a tensor containing trivial encryptions of 0

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results:

Result
Description

tensor

Attribute definition

GLWEBootstrapKeyAttr

An attribute representing bootstrap key.

Syntax:

#TFHE.bsk<
  mlir::concretelang::TFHE::GLWESecretKey,   # inputKey
  mlir::concretelang::TFHE::GLWESecretKey,   # outputKey
  int,   # polySize
  int,   # glweDim
  int,   # levels
  int,   # baseLog
  int   # index
>

Parameters:

Parameter
C++ type
Description

inputKey

mlir::concretelang::TFHE::GLWESecretKey

outputKey

mlir::concretelang::TFHE::GLWESecretKey

polySize

int

glweDim

int

levels

int

baseLog

int

index

int

GLWEKeyswitchKeyAttr

An attribute representing keyswitch key.

Syntax:

#TFHE.ksk<
  mlir::concretelang::TFHE::GLWESecretKey,   # inputKey
  mlir::concretelang::TFHE::GLWESecretKey,   # outputKey
  int,   # levels
  int,   # baseLog
  int   # index
>

Parameters:

Parameter
C++ type
Description

inputKey

mlir::concretelang::TFHE::GLWESecretKey

outputKey

mlir::concretelang::TFHE::GLWESecretKey

levels

int

baseLog

int

index

int

GLWEPackingKeyswitchKeyAttr

An attribute representing a packing keyswitch key (used by WoP-PBS).

Syntax:

#TFHE.pksk<
  mlir::concretelang::TFHE::GLWESecretKey,   # inputKey
  mlir::concretelang::TFHE::GLWESecretKey,   # outputKey
  int,   # outputPolySize
  int,   # innerLweDim
  int,   # glweDim
  int,   # levels
  int,   # baseLog
  int   # index
>

Parameters:

Parameter
C++ type
Description

inputKey

mlir::concretelang::TFHE::GLWESecretKey

outputKey

mlir::concretelang::TFHE::GLWESecretKey

outputPolySize

int

innerLweDim

int

glweDim

int

levels

int

baseLog

int

index

int

Type definition

GLWECipherTextType

A GLWE ciphertext.

Parameters:

Parameter
C++ type
Description

key

mlir::concretelang::TFHE::GLWESecretKey

FHELinalg dialect

High Level Fully Homomorphic Encryption Linalg dialect: a dialect for representing high-level linalg operations on fully homomorphic ciphertexts.

Operation definition

FHELinalg.add_eint_int (::mlir::concretelang::FHELinalg::AddEintIntOp)

Returns a tensor that contains the addition of a tensor of encrypted integers and a tensor of clear integers.

Performs an addition following the broadcasting rules between a tensor of encrypted integers and a tensor of clear integers. The width of the clear integers must be less than or equal to the width of encrypted integers.

Examples:

// Returns the term-by-term addition of `%a0` with `%a1`
"FHELinalg.add_eint_int"(%a0, %a1) : (tensor<4x!FHE.eint<4>>, tensor<4xi5>) -> tensor<4x!FHE.eint<4>>

// Returns the term-by-term addition of `%a0` with `%a1`, where dimensions equal to one are stretched.
"FHELinalg.add_eint_int"(%a0, %a1) : (tensor<4x1x4x!FHE.eint<4>>, tensor<1x4x4xi5>) -> tensor<4x4x4x!FHE.eint<4>>

// Returns the addition of a 3x3 matrix of encrypted integers and a 3x1 matrix (a column) of integers.
//
// [1,2,3]   [1]   [2,3,4]
// [4,5,6] + [2] = [6,7,8]
// [7,8,9]   [3]   [10,11,12]
//
// The dimension #1 of operand #2 is stretched as it is equal to 1.
"FHELinalg.add_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3x1xi5>) -> tensor<3x3x!FHE.eint<4>>

// Returns the addition of a 3x3 matrix of encrypted integers and a 1x3 matrix (a line) of integers.
//
// [1,2,3]             [2,4,6]
// [4,5,6] + [1,2,3] = [5,7,9]
// [7,8,9]             [8,10,12]
//
// The dimension #2 of operand #2 is stretched as it is equal to 1.
"FHELinalg.add_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<1x3xi5>) -> tensor<3x3x!FHE.eint<4>>

// Same behavior as the previous one, but dimension #2 of operand #2 is missing.
"FHELinalg.add_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3xi5>) -> tensor<3x3x!FHE.eint<4>>
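The broadcasting cases above can be reproduced on clear integers with NumPy, whose broadcasting gives the same results for these shapes. This is a sketch on plaintext arrays under that assumption; it does not involve encryption or the dialect itself.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

column = np.array([[1], [2], [3]])   # shape 3x1: stretched along each row
row = np.array([[1, 2, 3]])          # shape 1x3: stretched along each column
vector = np.array([1, 2, 3])         # shape 3: leading dimension is missing

plus_column = matrix + column
plus_row = matrix + row
plus_vector = matrix + vector        # behaves like the 1x3 case

# Dimensions equal to one are stretched: 4x1x4 with 1x4x4 gives 4x4x4.
stretched_shape = (np.zeros((4, 1, 4), dtype=int) + np.zeros((1, 4, 4), dtype=int)).shape

print(plus_column)
print(plus_row)
print(stretched_shape)
```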

Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt, TensorBroadcastingRules

Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.add_eint (::mlir::concretelang::FHELinalg::AddEintOp)

Returns a tensor that contains the addition of two tensors of encrypted integers.

Performs an addition following the broadcasting rules between two tensors of encrypted integers. The width of the encrypted integers must be equal.

Examples:

// Returns the term-by-term addition of `%a0` with `%a1`
"FHELinalg.add_eint"(%a0, %a1) : (tensor<4x!FHE.eint<4>>, tensor<4x!FHE.eint<4>>) -> tensor<4x!FHE.eint<4>>

// Returns the term-by-term addition of `%a0` with `%a1`, where dimensions equal to one are stretched.
"FHELinalg.add_eint"(%a0, %a1) : (tensor<4x1x4x!FHE.eint<4>>, tensor<1x4x4x!FHE.eint<4>>) -> tensor<4x4x4x!FHE.eint<4>>

// Returns the addition of a 3x3 matrix of encrypted integers and a 3x1 matrix (a column) of encrypted integers.
//
// [1,2,3]   [1]   [2,3,4]
// [4,5,6] + [2] = [6,7,8]
// [7,8,9]   [3]   [10,11,12]
//
// The dimension #1 of operand #2 is stretched as it is equal to 1.
"FHELinalg.add_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3x1x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

// Returns the addition of a 3x3 matrix of encrypted integers and a 1x3 matrix (a line) of encrypted integers.
//
// [1,2,3]             [2,4,6]
// [4,5,6] + [1,2,3] = [5,7,9]
// [7,8,9]             [8,10,12]
//
// The dimension #2 of operand #2 is stretched as it is equal to 1.
"FHELinalg.add_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<1x3x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

// Same behavior as the previous one, but dimension #2 of operand #2 is missing.
"FHELinalg.add_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint, TensorBroadcastingRules

Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.apply_lookup_table (::mlir::concretelang::FHELinalg::ApplyLookupTableEintOp)

Returns a tensor that contains the result of the lookup on a table.

For each encrypted index, performs a lookup in a table of clear integers.

// The result of this operation is a tensor that contains the result of a lookup table.
// i.e. %res[i, ..., k] = %lut[%t[i, ..., k]]
%res = FHELinalg.apply_lookup_table(%t, %lut): tensor<DNx...xD1x!FHE.eint<$p>>, tensor<D2^$pxi64> -> tensor<DNx...xD1x!FHE.eint<$p>>

The %lut argument must be a one-dimensional tensor of size 2^p, where p is the width of the encrypted integers.

Examples:


// Returns the lookup of a 3x3 matrix of encrypted indices of width 2 on a table of size 4=2² of clear integers.
//
// [0,1,2]                 [1,3,5]
// [3,0,1] lut [1,3,5,7] = [7,1,3]
// [2,3,0]                 [5,7,1]
"FHELinalg.apply_lookup_table"(%t, %lut) : (tensor<3x3x!FHE.eint<2>>, tensor<4xi64>) -> tensor<3x3x!FHE.eint<3>>
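On clear integers, the rule `%res[i, ..., k] = %lut[%t[i, ..., k]]` is ordinary array indexing. The NumPy sketch below reproduces the example above on plaintext values; it illustrates the semantics only and says nothing about how the lookup is evaluated on ciphertexts.

```python
import numpy as np

# Indices of width 2, so every entry is in [0, 3].
t = np.array([[0, 1, 2], [3, 0, 1], [2, 3, 0]])

# Table of size 4 = 2^2 holding clear integers.
lut = np.array([1, 3, 5, 7])

# Each element of t selects one entry of the table.
res = lut[t]
print(res)
```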

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

t

lut

Results:

Result
Description

«unnamed»

FHELinalg.apply_mapped_lookup_table (::mlir::concretelang::FHELinalg::ApplyMappedLookupTableEintOp)

Returns a tensor that contains the result of the lookup on a table, using a different lookup table for each element, specified by a map.

For each encrypted index, performs a lookup in a table of clear integers. Multiple lookup tables are passed, and the application of lookup tables is performed following the broadcasting rules. The precise lookup is specified by a map.

// The result of this operation is a tensor that contains the result of the lookup on different tables.
// i.e. %res[i, ..., k] = %luts[ %map[i, ..., k] ][ %t[i, ..., k] ]
%res = FHELinalg.apply_mapped_lookup_table(%t, %luts, %map): tensor<DNx...xD1x!FHE.eint<$p>>, tensor<DMx2^$pxi64>, tensor<DNx...xD1xindex> -> tensor<DNx...xD1x!FHE.eint<$p>>

Examples:


// Returns the lookup of 3x2 matrix of encrypted indices of width 2 on a vector of 2 tables of size 4=2^2 of clear integers.
//
// [0,1]                                 [0, 1] = [1,2]
// [3,0] lut [[1,3,5,7], [0,2,4,6]] with [0, 1] = [7,0]
// [2,3]                                 [0, 1] = [5,6]
"FHELinalg.apply_mapped_lookup_table"(%t, %luts, %map) : (tensor<3x2x!FHE.eint<2>>, tensor<2x4xi64>, tensor<3x2xindex>) -> tensor<3x2x!FHE.eint<3>>

Other examples:

// [0,1]                                 [1, 0] = [0,3]
// [3,0] lut [[1,3,5,7], [0,2,4,6]] with [0, 1] = [7,0]
// [2,3]                                 [1, 0] = [4,7]

// [0,1]                                 [0, 0] = [1,3]
// [3,0] lut [[1,3,5,7], [0,2,4,6]] with [1, 1] = [6,0]
// [2,3]                                 [1, 0] = [4,7]

// [0,1]                                 [0] = [1,3]
// [3,0] lut [[1,3,5,7], [0,2,4,6]] with [1] = [6,0]
// [2,3]                                 [0] = [5,7]

// [0,1]                                        = [1,2]
// [3,0] lut [[1,3,5,7], [0,2,4,6]] with [0, 1] = [7,0]
// [2,3]                                        = [5,6]
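The rule `%res[i, ..., k] = %luts[ %map[i, ..., k] ][ %t[i, ..., k] ]` can be checked on clear values with a short sketch. The function name and the nested-list representation are hypothetical, and the sketch assumes a map with the same shape as the tensor (no broadcasting of the map).

```python
def apply_mapped_lookup_table(t, luts, table_map):
    """Plaintext analogue of res[i][j] = luts[table_map[i][j]][t[i][j]]."""
    return [
        [luts[table][index] for index, table in zip(t_row, map_row)]
        for t_row, map_row in zip(t, table_map)
    ]


t = [[0, 1], [3, 0], [2, 3]]
luts = [[1, 3, 5, 7], [0, 2, 4, 6]]

# Map that selects table 0 for the first column and table 1 for the second.
print(apply_mapped_lookup_table(t, luts, [[0, 1], [0, 1], [0, 1]]))
# [[1, 2], [7, 0], [5, 6]]

# Map that selects a different table per element.
print(apply_mapped_lookup_table(t, luts, [[0, 0], [1, 1], [1, 0]]))
# [[1, 3], [6, 0], [4, 7]]
```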

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

t

luts

map

Results:

Result
Description

«unnamed»

FHELinalg.apply_multi_lookup_table (::mlir::concretelang::FHELinalg::ApplyMultiLookupTableEintOp)

Returns a tensor that contains the result of the lookup on a table, using a different lookup table for each element.

For each encrypted index, looks up the value in a table of clear integers. Multiple lookup tables are passed, and they are applied following the broadcasting rules.

// The result of this operation is a tensor that contains the result of the lookup on different tables.
// i.e. %res[i, ..., k] = [ %luts[i][%t[i]], ..., %luts[k][%t[k]] ]
%res = FHELinalg.apply_multi_lookup_table(%t, %luts): tensor<DNx...xD1x!FHE.eint<$p>>, tensor<DMx...xD1xD2^$pxi64> -> tensor<DNx...xD1x!FHE.eint<$p>>

The %luts argument should be a tensor with M dimensions, where the first M-1 dimensions are broadcastable with the N dimensions of the encrypted tensor, and where the last dimension is equal to 2^p, with p the width of the encrypted integers.
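A minimal plaintext sketch of the case where a vector of tables is broadcast along the first dimension of a 2-D tensor, so that column `j` always uses `luts[j]`. The function name is hypothetical and only this one broadcasting case is modelled.

```python
def apply_multi_lookup_table(t, luts):
    """Plaintext analogue for a 2-D tensor with one lookup table per column."""
    return [[luts[j][index] for j, index in enumerate(row)] for row in t]


# Values taken from the 3x2 example given for this operation.
t = [[0, 1], [3, 0], [2, 3]]
luts = [[1, 3, 5, 7], [0, 2, 4, 6]]

print(apply_multi_lookup_table(t, luts))  # [[1, 2], [7, 0], [5, 6]]
```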

Examples:


// Returns the lookup of 3x2 matrix of encrypted indices of width 2 on a vector of 2 tables of size 4=2² of clear integers.
// The tables are broadcasted along the first dimension of the tensor.
//
// [0,1]                            = [1,2]
// [3,0] lut [[1,3,5,7], [0,2,4,6]] = [7,0]
// [2,3]                            = [5,6]
"FHELinalg.apply_multi_lookup_table"(%t, %luts) : (tensor<3x2x!FHE.eint<2>>, tensor<2x4xi64>) -> tensor<3x2x!FHE.eint<3>>

// Returns the lookup of a vector of 3 encrypted indices of width 2 on a vector of 3 tables of size 4=2² of clear integers.
//
// [3,0,1] lut [[1,3,5,7], [0,2,4,6], [1,2,3,4]] = [7,0,2]
"FHELinalg.apply_multi_lookup_table"(%t, %luts) : (tensor<3x!FHE.eint<2>>, tensor<3x4xi64>) -> tensor<3x!FHE.eint<3>>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

t

luts

Results:

Result
Description

«unnamed»

FHELinalg.broadcast (::mlir::concretelang::FHELinalg::BroadcastOp)

Broadcasts a tensor to a shape.

Broadcasting is used for expanding certain dimensions of a tensor or adding new dimensions to it at the beginning.

An example could be broadcasting a tensor with shape <1x2x1x4x1> to a tensor of shape <6x1x2x3x4x5>.

In this example:

  • last dimension of the input (1) is expanded to (5)

  • the dimension before that (4) is kept

  • the dimension before that (1) is expanded to (3)

  • the dimension before that (2) is kept

  • the dimension before that (1) is kept

  • a new dimension (6) is added to the beginning

See https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules for the semantics of broadcasting.
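The dimension-by-dimension walk above can be reproduced with a small sketch that compares the two shapes from the last dimension backwards. The helper name `describe_broadcast` is hypothetical; it only inspects shapes and does not move any data.

```python
def describe_broadcast(src_shape, dst_shape):
    """Explain, from the last dimension backwards, how src_shape becomes dst_shape."""
    if len(src_shape) > len(dst_shape):
        raise ValueError("the target shape cannot have fewer dimensions than the input")
    # Missing leading dimensions of the input are marked with None.
    padded = [None] * (len(dst_shape) - len(src_shape)) + list(src_shape)
    steps = []
    for src, dst in zip(reversed(padded), reversed(dst_shape)):
        if src is None:
            steps.append(f"new dimension ({dst})")
        elif src == dst:
            steps.append(f"({src}) kept")
        elif src == 1:
            steps.append(f"(1) expanded to ({dst})")
        else:
            raise ValueError(f"cannot broadcast dimension {src} to {dst}")
    return steps


for step in describe_broadcast((1, 2, 1, 4, 1), (6, 1, 2, 3, 4, 5)):
    print(step)
```

A dimension is accepted only if it already matches the target or is equal to 1, which is the NumPy rule referenced above.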

Examples:

"FHELinalg.broadcast"(%t) : (tensor<1xindex>) -> tensor<3xindex>
//
// broadcast([5]) = [5, 5, 5]
//
"FHELinalg.broadcast"(%t) : (tensor<1xindex>) -> tensor<3x2xindex>
//
// broadcast([5]) = [[5, 5], [5, 5], [5, 5]]
//
"FHELinalg.broadcast"(%t) : (tensor<2xindex>) -> tensor<3x2xindex>
//
// broadcast([2, 6]) = [[2, 6], [2, 6], [2, 6]]
//
"FHELinalg.broadcast"(%t) : (tensor<3x1xindex>) -> tensor<3x2xindex>
//
// broadcast([[1], [2], [3]]) = [[1, 1], [2, 2], [3, 3]]
//
"FHELinalg.broadcast"(%t) : (tensor<2xindex>) -> tensor<2x3x2xindex>
//
// broadcast([2, 6]) = [[[2, 6], [2, 6], [2, 6]], [[2, 6], [2, 6], [2, 6]]]
//
"FHELinalg.broadcast"(%t) : (tensor<3x1xindex>) -> tensor<2x3x2xindex>
//
// broadcast([[1], [2], [3]]) = [[[1, 1], [2, 2], [3, 3]], [[1, 1], [2, 2], [3, 3]]]
//

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

output

FHELinalg.concat (::mlir::concretelang::FHELinalg::ConcatOp)

Concatenates a sequence of tensors along an existing axis.

Concatenates several tensors along a given axis.

Examples:

"FHELinalg.concat"(%a, %b) { axis = 0 } : (tensor<3x3x!FHE.eint<4>>, tensor<3x3x!FHE.eint<4>>) -> tensor<6x3x!FHE.eint<4>>
//
//        ( [1,2,3]  [1,2,3] )   [1,2,3]
// concat ( [4,5,6], [4,5,6] ) = [4,5,6]
//        ( [7,8,9]  [7,8,9] )   [7,8,9]
//                               [1,2,3]
//                               [4,5,6]
//                               [7,8,9]
//
"FHELinalg.concat"(%a, %b) { axis = 1 } : (tensor<3x3x!FHE.eint<4>>, tensor<3x3x!FHE.eint<4>>) -> tensor<3x6x!FHE.eint<4>>
//
//        ( [1,2,3]  [1,2,3] )   [1,2,3,1,2,3]
// concat ( [4,5,6], [4,5,6] ) = [4,5,6,4,5,6]
//        ( [7,8,9]  [7,8,9] )   [7,8,9,7,8,9]
//
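The two axis choices can be compared on clear values with a short sketch. The function name is hypothetical and the sketch is limited to two 2-D tensors given as lists of rows.

```python
def concat(a, b, axis):
    """Plaintext analogue of concatenating two 2-D tensors along axis 0 or 1."""
    if axis == 0:
        return a + b  # the rows of b are appended below the rows of a
    if axis == 1:
        return [row_a + row_b for row_a, row_b in zip(a, b)]  # each row is extended
    raise ValueError("axis must be 0 or 1 for 2-D tensors")


m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(concat(m, m, axis=0))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(concat(m, m, axis=1))
# [[1, 2, 3, 1, 2, 3], [4, 5, 6, 4, 5, 6], [7, 8, 9, 7, 8, 9]]
```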

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

axis

::mlir::IntegerAttr

64-bit signless integer attribute

Operands:

Operand
Description

ins

Results:

Result
Description

out

FHELinalg.conv2d (::mlir::concretelang::FHELinalg::Conv2dOp)

Returns the 2D convolution of a tensor in the form NCHW with weights in the form FCHW

Traits: AlwaysSpeculatableImplTrait

Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

padding

::mlir::DenseIntElementsAttr

64-bit signless integer elements attribute

strides

::mlir::DenseIntElementsAttr

64-bit signless integer elements attribute

dilations

::mlir::DenseIntElementsAttr

64-bit signless integer elements attribute

group

::mlir::IntegerAttr

64-bit signless integer attribute

Operands:

Operand
Description

input

weight

bias

Results:

Result
Description

«unnamed»

FHELinalg.dot_eint_int (::mlir::concretelang::FHELinalg::Dot)

Returns the encrypted dot product between a vector of encrypted integers and a vector of clear integers.

Performs a dot product between a vector of encrypted integers and a vector of clear integers.

Examples:

// Returns the dot product of `%a0` with `%a1`
"FHELinalg.dot_eint_int"(%a0, %a1) : (tensor<4x!FHE.eint<4>>, tensor<4xi5>) -> !FHE.eint<4>

Traits: AlwaysSpeculatableImplTrait

Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

out

FHELinalg.dot_eint_eint (::mlir::concretelang::FHELinalg::DotEint)

Returns the encrypted dot product between two vectors of encrypted integers.

Performs a dot product between two vectors of encrypted integers.

Examples:

// Returns the dot product of `%a0` with `%a1`
"FHELinalg.dot_eint_eint"(%a0, %a1) : (tensor<4x!FHE.eint<4>>, tensor<4x!FHE.eint<4>>) -> !FHE.eint<4>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

out

FHELinalg.fancy_assign (::mlir::concretelang::FHELinalg::FancyAssignOp)

Assigns a tensor into another tensor at a tensor of indices.

Examples:

"FHELinalg.fancy_assign"(%t, %i, %a) : (tensor<5x!FHE.eint<16>>, tensor<3xindex>, tensor<3x!FHE.eint<16>>) -> tensor<5x!FHE.eint<16>>
//
// fancy_assign([10, 20, 30, 40, 50], [3, 1, 2], [1000, 2000, 3000]) = [10, 2000, 3000, 1000, 50]
//
"FHELinalg.fancy_assign"(%t, %i, %a) : (tensor<5x!FHE.eint<16>>, tensor<2x2xindex>, tensor<2x2x!FHE.eint<16>>) -> tensor<5x!FHE.eint<16>>
//
// fancy_assign([10, 20, 30, 40, 50], [[3, 1], [2, 0]], [[1000, 2000], [3000, 4000]]) = [4000, 2000, 3000, 1000, 50]
//
"FHELinalg.fancy_assign"(%t, %i, %a) : (tensor<2x3x!FHE.eint<16>>, tensor<3x2xindex>, tensor<3x!FHE.eint<16>>) -> tensor<2x3x!FHE.eint<16>>
//
// fancy_assign([[11, 12, 13], [21, 22, 23]], [[1, 0], [0, 2], [0, 0]], [1000, 2000, 3000]) = [[3000, 12, 2000], [1000, 22, 23]]
//
"FHELinalg.fancy_assign"(%t, %i, %a) : (tensor<3x3x!FHE.eint<16>>, tensor<2x3x2xindex>, tensor<2x3x!FHE.eint<16>>) -> tensor<3x3x!FHE.eint<16>>
//
// fancy_assign(
//     [[11, 12, 13], [21, 22, 23], [31, 32, 33]],
//     [[[1, 0], [0, 2], [0, 0]], [[2, 0], [1, 1], [2, 1]]],
//     [[1000, 2000, 3000], [4000, 5000, 6000]]
// ) = [[3000, 12, 2000], [1000, 5000, 23], [4000, 6000, 33]]
//

Notes:

  • Assigning to the same output position results in undefined behavior.
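A minimal plaintext sketch of the assignment on nested lists. The function name is hypothetical. For a 1-D tensor every index is a single integer; for an N-D tensor each innermost list of the index tensor is read as one coordinate. When the same position is written twice, this sketch simply keeps the last write, whereas the operation itself leaves that case undefined.

```python
import copy


def fancy_assign(t, indices, values):
    """Plaintext analogue: return a copy of t with values written at the given indices."""
    result = copy.deepcopy(t)
    rank, probe = 0, result
    while isinstance(probe, list):  # count the dimensions of t
        probe = probe[0]
        rank += 1

    def assign(idx, val):
        # A leaf is a bare integer for 1-D tensors, or one coordinate list otherwise.
        if rank == 1:
            is_leaf = not isinstance(idx, list)
        else:
            is_leaf = not isinstance(idx[0], list)
        if not is_leaf:
            for sub_idx, sub_val in zip(idx, val):
                assign(sub_idx, sub_val)
            return
        coordinates = [idx] if rank == 1 else idx
        target = result
        for c in coordinates[:-1]:
            target = target[c]
        target[coordinates[-1]] = val

    assign(indices, values)
    return result


print(fancy_assign([10, 20, 30, 40, 50], [3, 1, 2], [1000, 2000, 3000]))
# [10, 2000, 3000, 1000, 50]
print(fancy_assign([10, 20, 30, 40, 50], [[3, 1], [2, 0]], [[1000, 2000], [3000, 4000]]))
# [4000, 2000, 3000, 1000, 50]
```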

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

indices

values

Results:

Result
Description

output

FHELinalg.fancy_index (::mlir::concretelang::FHELinalg::FancyIndexOp)

Index into a tensor using a tensor of indices.

Examples:

"FHELinalg.fancy_index"(%t, %i) : (tensor<5x!FHE.eint<6>>, tensor<3xindex>) -> tensor<3x!FHE.eint<6>>
//
// fancy_index([10, 20, 30, 40, 50], [3, 1, 2]) = [40, 20, 30]
//
"FHELinalg.fancy_index"(%t, %i) : (tensor<5x!FHE.eint<6>>, tensor<3x2xindex>) -> tensor<3x2x!FHE.eint<6>>
//
// fancy_index([10, 20, 30, 40, 50], [[3, 1], [2, 2], [0, 4]]) = [[40, 20], [30, 30], [10, 50]]
//
"FHELinalg.fancy_index"(%t, %i) : (tensor<2x3x!FHE.eint<6>>, tensor<3x2xindex>) -> tensor<3x!FHE.eint<6>>
//
// fancy_index([[11, 12, 13], [21, 22, 23]], [[1, 0], [0, 2], [0, 0]]) = [21, 13, 11]
//
"FHELinalg.fancy_index"(%t, %i) : (tensor<3x3x!FHE.eint<6>>, tensor<2x3x2xindex>) -> tensor<2x3x!FHE.eint<6>>
//
// fancy_index([[11, 12, 13], [21, 22, 23], [31, 32, 33]], [[[1, 0], [0, 2], [0, 0]], [[2, 0], [1, 1], [2, 1]]]) = [[21, 13, 11], [31, 22, 32]]
//
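The indexing convention used in these examples can be sketched on nested lists as follows. The function name is hypothetical. For a 1-D tensor every index is a single integer; for an N-D tensor each innermost list of the index tensor is read as one coordinate.

```python
def fancy_index(t, indices):
    """Plaintext analogue of indexing a nested-list tensor with a tensor of indices."""
    rank, probe = 0, t
    while isinstance(probe, list):  # count the dimensions of t
        probe = probe[0]
        rank += 1

    def lookup(idx):
        if rank == 1:
            if isinstance(idx, list):
                return [lookup(i) for i in idx]
            return t[idx]
        if isinstance(idx[0], list):  # not yet at a coordinate list
            return [lookup(i) for i in idx]
        element = t
        for coordinate in idx:
            element = element[coordinate]
        return element

    return lookup(indices)


print(fancy_index([10, 20, 30, 40, 50], [[3, 1], [2, 2], [0, 4]]))
# [[40, 20], [30, 30], [10, 50]]
print(fancy_index([[11, 12, 13], [21, 22, 23]], [[1, 0], [0, 2], [0, 0]]))
# [21, 13, 11]
```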

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

indices

Results:

Result
Description

output

FHELinalg.from_element (::mlir::concretelang::FHELinalg::FromElementOp)

Creates a tensor with a single element.

"FHELinalg.from_element"(%a) : (Type) -> tensor<1xType>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

«unnamed»

any type

Results:

Result
Description

«unnamed»

FHELinalg.lsb (::mlir::concretelang::FHELinalg::LsbEintOp)

Extracts the least significant bit at a given precision.

This operation extracts the lsb of a ciphertext tensor in a specific precision.

Extracting only 1 bit:

 // ok
 %lsb = "FHELinalg.lsb"(%a): (tensor<1x!FHE.eint<4>>) -> (tensor<1x!FHE.eint<1>>)

If you need to clear the lsb of the original ciphertext, you should extract to the same precision as the ciphertext.
If you need to extract several bits, you can extract sequentially using explicit bitwidth change and bit clearing.
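On clear integers, the sequential extraction described above can be pictured as the loop below. This is only a plaintext analogue under stated assumptions: `& 1` stands in for the lsb extraction, the subtraction stands in for bit clearing, and the right shift stands in for the change to a smaller bit width. The function name is hypothetical and no ciphertext or noise is modelled.

```python
def extract_bits_lsb_first(value, width):
    """Peel off the bits of a `width`-bit unsigned integer, least significant first."""
    bits = []
    for _ in range(width):
        lsb = value & 1      # analogue of extracting the lsb
        value = value - lsb  # analogue of clearing the extracted bit
        value = value >> 1   # analogue of moving to a precision one bit smaller
        bits.append(lsb)
    return bits


print(extract_bits_lsb_first(0b1011, 4))  # [1, 1, 0, 1]
```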

Example:
```mlir
 // ok
 %a_lsb = "FHELinalg.lsb"(%a): (tensor<1x!FHE.eint<4>>) -> (tensor<1x!FHE.eint<4>>)
 %a_lsb_cleared = "FHELinalg.sub_eint"(%a, %a_lsb) : (tensor<1x!FHE.eint<4>>, tensor<1x!FHE.eint<4>>) -> (tensor<1x!FHE.eint<4>>)
 %b = %a : tensor<1x!FHE.eint<3>>
 // now you can extract the next lsb from %b
 %b_lsb = "FHELinalg.lsb"(%b): (tensor<1x!FHE.eint<3>>) -> (tensor<1x!FHE.eint<3>>)
 // later if you need %b_lsb at the original position
 %b_lsb_as_in_a = %b_lsb : tensor<1x!FHE.eint<3>>
```

Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint

Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

output

FHELinalg.matmul_eint_eint (::mlir::concretelang::FHELinalg::MatMulEintEintOp)

Returns a tensor that contains the result of the matrix multiplication of a matrix of encrypted integers and a second matrix of encrypted integers.

Performs a matrix multiplication of a matrix of encrypted integers and a second matrix of encrypted integers.

The behavior depends on the arguments in the following way:

- If both arguments are 2-D,
  they are multiplied like conventional matrices.

  e.g.,

  arg0: tensor<MxN> = [...]
  arg1: tensor<NxP> = [...]

  result: tensor<MxP> = [...]

- If the first argument is a vector (1-D),
  it is treated as a matrix with a single row and standard matrix multiplication is performed.

  After standard matrix multiplication,
  the first dimension is removed from the result.

  e.g.,

  arg0: tensor<3> = [x, y, z]
  arg1: tensor<3xM> = [
      [_, _, ..., _, _],
      [_, _, ..., _, _],
      [_, _, ..., _, _],
  ]

  is treated as

  arg0: tensor<1x3> = [
      [x, y, z]
  ]
  arg1: tensor<3xM> = [
      [_, _, ..., _, _],
      [_, _, ..., _, _],
      [_, _, ..., _, _],
  ]

  and matrix multiplication is performed with the following form (1x3 @ 3xM -> 1xM)

  result: tensor<1xM> = [[_, _, ..., _, _]]

  finally, the first dimension is removed by definition so the result has the following form

  result: tensor<M>  = [_, _, ..., _, _]

- If the second argument is 1-D,
  it is treated as a matrix with a single column and standard matrix multiplication is performed.

  After standard matrix multiplication,
  the last dimension is removed from the result.

  e.g.,

  arg0: tensor<Mx3> = [
      [_, _, _],
      [_, _, _],
      ...,
      [_, _, _],
      [_, _, _],
  ]
  arg1: tensor<3> = [x, y, z]

  is treated as

  arg0: tensor<Mx3> = [
      [_, _, _],
      [_, _, _],
      ...,
      [_, _, _],
      [_, _, _],
  ]
  arg1: tensor<3x1> = [
    [x],
    [y],
    [z],
  ]

  and matrix multiplication is performed with the following form (Mx3 @ 3x1 -> Mx1)

  result: tensor<Mx1> = [
    [_],
    [_],
      ...,
    [_],
    [_],
  ]

  finally, the last dimension is removed by definition so the result has the following form

  result: tensor<M> = [_, _, ..., _, _]

- If either argument is N-D where N > 2,
  the operation is treated as a collection of matrices residing in the last two indices and broadcasted accordingly.

  arg0: tensor<Kx1xMxN> = [...]
  arg1: tensor<LxNxP> = [...]

  result: tensor<KxLxMxP> = [...]

"FHELinalg.matmul_eint_eint"(%a, %b) : (tensor<MxNx!FHE.eint<p>>, tensor<NxPx!FHE.eint<p>>) -> tensor<MxPx!FHE.eint<p>>
"FHELinalg.matmul_eint_eint"(%a, %b) : (tensor<KxLxMxNx!FHE.eint<p>>, tensor<KxLxNxPx!FHE.eint<p>>) -> tensor<KxLxMxPx!FHE.eint<p>>
"FHELinalg.matmul_eint_eint"(%a, %b) : (tensor<MxNx!FHE.eint<p>>, tensor<Nx!FHE.eint<p>>) -> tensor<Mx!FHE.eint<p>>
"FHELinalg.matmul_eint_eint"(%a, %b) : (tensor<Nx!FHE.eint<p>>, tensor<NxPx!FHE.eint<p>>) -> tensor<Px!FHE.eint<p>>

Examples:

// Returns the matrix multiplication of a 3x2 matrix of encrypted integers and a 2x3 matrix of encrypted integers.
//         [ 1, 2, 3]
//         [ 2, 3, 4]
//       *
// [1,2]   [ 5, 8,11]
// [3,4] = [11,18,25]
// [5,6]   [17,28,39]
//
"FHELinalg.matmul_eint_eint"(%a, %b) : (tensor<3x2x!FHE.eint<6>>, tensor<2x3x!FHE.eint<6>>) -> tensor<3x3x!FHE.eint<12>>

Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.matmul_eint_int (::mlir::concretelang::FHELinalg::MatMulEintIntOp)

Returns a tensor that contains the result of the matrix multiplication of a matrix of encrypted integers and a matrix of clear integers.

Performs a matrix multiplication of a matrix of encrypted integers and a matrix of clear integers. The width of the clear integers must be less than or equal to the width of encrypted integers.

The behavior depends on the arguments in the following way:

- If both arguments are 2-D,
  they are multiplied like conventional matrices.

  e.g.,

  arg0: tensor<MxN> = [...]
  arg1: tensor<NxP> = [...]

  result: tensor<MxP> = [...]

- If the first argument is a vector (1-D),
  it is treated as a matrix with a single row and standard matrix multiplication is performed.

  After standard matrix multiplication,
  the first dimension is removed from the result.

  e.g.,

  arg0: tensor<3> = [x, y, z]
  arg1: tensor<3xM> = [
      [_, _, ..., _, _],
      [_, _, ..., _, _],
      [_, _, ..., _, _],
  ]

  is treated as

  arg0: tensor<1x3> = [
      [x, y, z]
  ]
  arg1: tensor<3xM> = [
      [_, _, ..., _, _],
      [_, _, ..., _, _],
      [_, _, ..., _, _],
  ]

  and matrix multiplication is performed with the following form (1x3 @ 3xM -> 1xM)

  result: tensor<1xM> = [[_, _, ..., _, _]]

  finally, the first dimension is removed by definition so the result has the following form

  result: tensor<M>  = [_, _, ..., _, _]

- If the second argument is 1-D,
  it is treated as a matrix with a single column and standard matrix multiplication is performed.

  After standard matrix multiplication,
  the last dimension is removed from the result.

  e.g.,

  arg0: tensor<Mx3> = [
      [_, _, _],
      [_, _, _],
      ...,
      [_, _, _],
      [_, _, _],
  ]
  arg1: tensor<3> = [x, y, z]

  is treated as

  arg0: tensor<Mx3> = [
      [_, _, _],
      [_, _, _],
      ...,
      [_, _, _],
      [_, _, _],
  ]
  arg1: tensor<3x1> = [
    [x],
    [y],
    [z],
  ]

  and matrix multiplication is performed with the following form (Mx3 @ 3x1 -> Mx1)

  result: tensor<Mx1> = [
    [_],
    [_],
      ...,
    [_],
    [_],
  ]

  finally, the last dimension is removed by definition so the result has the following form

  result: tensor<M> = [_, _, ..., _, _]

- If either argument is N-D where N > 2,
  the operation is treated as a collection of matrices residing in the last two indices and broadcasted accordingly.

  arg0: tensor<Kx1xMxN> = [...]
  arg1: tensor<LxNxP> = [...]

  result: tensor<KxLxMxP> = [...]
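The shape rules listed above can be summarised in a small shape calculator. This is a minimal sketch with a hypothetical function name; it works on shapes only, and it deliberately rejects the case of two 1-D operands because the text does not describe it.

```python
def matmul_result_shape(lhs_shape, rhs_shape):
    """Result shape of a matrix multiplication under the rules listed above."""
    lhs, rhs = list(lhs_shape), list(rhs_shape)
    lhs_is_vector, rhs_is_vector = len(lhs) == 1, len(rhs) == 1
    if not lhs or not rhs or (lhs_is_vector and rhs_is_vector):
        raise ValueError("this sketch only covers the cases described in the text")
    if lhs_is_vector:
        lhs = [1] + lhs  # a vector on the left is treated as a single-row matrix
    if rhs_is_vector:
        rhs = rhs + [1]  # a vector on the right is treated as a single-column matrix
    if lhs[-1] != rhs[-2]:
        raise ValueError("inner dimensions do not match")

    # Broadcast the leading (batch) dimensions, aligning them from the right.
    batch_lhs, batch_rhs = lhs[:-2], rhs[:-2]
    rank = max(len(batch_lhs), len(batch_rhs))
    batch_lhs = [1] * (rank - len(batch_lhs)) + batch_lhs
    batch_rhs = [1] * (rank - len(batch_rhs)) + batch_rhs
    batch = []
    for a, b in zip(batch_lhs, batch_rhs):
        if a != b and 1 not in (a, b):
            raise ValueError("batch dimensions are not broadcastable")
        batch.append(max(a, b))

    core = []
    if not lhs_is_vector:
        core.append(lhs[-2])  # the row count is dropped when lhs was a vector
    if not rhs_is_vector:
        core.append(rhs[-1])  # the column count is dropped when rhs was a vector
    return tuple(batch + core)


print(matmul_result_shape((3, 2), (2, 3)))        # (3, 3)
print(matmul_result_shape((3,), (3, 5)))          # (5,)
print(matmul_result_shape((4, 3), (3,)))          # (4,)
print(matmul_result_shape((7, 1, 4, 2), (5, 2, 3)))  # (7, 5, 4, 3)
```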
"FHELinalg.matmul_eint_int(%a, %b) : (tensor<MxNx!FHE.eint<p>>, tensor<NxPxip'>) -> tensor<MxPx!FHE.eint<p>>"
"FHELinalg.matmul_eint_int(%a, %b) : (tensor<KxLxMxNx!FHE.eint<p>>, tensor<KxLxNxPxip'>) -> tensor<KxLxMxPx!FHE.eint<p>>"
"FHELinalg.matmul_eint_int(%a, %b) : (tensor<MxNx!FHE.eint<p>>, tensor<Nxip'>) -> tensor<Mx!FHE.eint<p>>"
"FHELinalg.matmul_eint_int(%a, %b) : (tensor<Nx!FHE.eint<p>>, tensor<NxPxip'>) -> tensor<Px!FHE.eint<p>>"

Examples:

// Returns the matrix multiplication of a 3x2 matrix of encrypted integers and a 2x3 matrix of integers.
//         [ 1, 2, 3]
//         [ 2, 3, 4]
//       *
// [1,2]   [ 5, 8,11]
// [3,4] = [11,18,25]
// [5,6]   [17,28,39]
//
"FHELinalg.matmul_eint_int"(%a, %b) : (tensor<3x2x!FHE.eint<6>>, tensor<2x3xi7>) -> tensor<3x3x!FHE.eint<6>>

Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt

Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.matmul_int_eint (::mlir::concretelang::FHELinalg::MatMulIntEintOp)

Returns a tensor that contains the result of the matrix multiplication of a matrix of clear integers and a matrix of encrypted integers.

Performs a matrix multiplication of a matrix of clear integers and a matrix of encrypted integers. The width of the clear integers must be less than or equal to the width of encrypted integers.

The behavior depends on the arguments in the following way:

- If both arguments are 2-D,
  they are multiplied like conventional matrices.

  e.g.,

  arg0: tensor<MxN> = [...]
  arg1: tensor<NxP> = [...]

  result: tensor<MxP> = [...]

- If the first argument is a vector (1-D),
  it is treated as a matrix with a single row and standard matrix multiplication is performed.

  After standard matrix multiplication,
  the first dimension is removed from the result.

  e.g.,

  arg0: tensor<3> = [x, y, z]
  arg1: tensor<3xM> = [
      [_, _, ..., _, _],
      [_, _, ..., _, _],
      [_, _, ..., _, _],
  ]

  is treated as

  arg0: tensor<1x3> = [
      [x, y, z]
  ]
  arg1: tensor<3xM> = [
      [_, _, ..., _, _],
      [_, _, ..., _, _],
      [_, _, ..., _, _],
  ]

  and matrix multiplication is performed with the following form (1x3 @ 3xM -> 1xM)

  result: tensor<1xM> = [[_, _, ..., _, _]]

  finally, the first dimension is removed by definition so the result has the following form

  result: tensor<M>  = [_, _, ..., _, _]

- If the second argument is 1-D,
  it is treated as a matrix with a single column and standard matrix multiplication is performed.

  After standard matrix multiplication,
  the last dimension is removed from the result.

  e.g.,

  arg0: tensor<Mx3> = [
      [_, _, _],
      [_, _, _],
      ...,
      [_, _, _],
      [_, _, _],
  ]
  arg1: tensor<3> = [x, y, z]

  is treated as

  arg0: tensor<Mx3> = [
      [_, _, _],
      [_, _, _],
      ...,
      [_, _, _],
      [_, _, _],
  ]
  arg1: tensor<3x1> = [
    [x],
    [y],
    [z],
  ]

  and matrix multiplication is performed with the following form (Mx3 @ 3x1 -> Mx1)

  result: tensor<Mx1> = [
    [_],
    [_],
      ...,
    [_],
    [_],
  ]

  finally, the last dimension is removed by definition so the result has the following form

  result: tensor<M> = [_, _, ..., _, _]

- If either argument is N-D where N > 2,
  the operation is treated as a collection of matrices residing in the last two indices and broadcasted accordingly.

  arg0: tensor<Kx1xMxN> = [...]
  arg1: tensor<LxNxP> = [...]

  result: tensor<KxLxMxP> = [...]

"FHELinalg.matmul_int_eint"(%a, %b) : (tensor<MxNxip'>, tensor<NxPx!FHE.eint<p>>) -> tensor<MxPx!FHE.eint<p>>
"FHELinalg.matmul_int_eint"(%a, %b) : (tensor<KxLxMxNxip'>, tensor<KxLxNxPx!FHE.eint<p>>) -> tensor<KxLxMxPx!FHE.eint<p>>
"FHELinalg.matmul_int_eint"(%a, %b) : (tensor<MxNxip'>, tensor<Nx!FHE.eint<p>>) -> tensor<Mx!FHE.eint<p>>
"FHELinalg.matmul_int_eint"(%a, %b) : (tensor<Nxip'>, tensor<NxPx!FHE.eint<p>>) -> tensor<Px!FHE.eint<p>>

Examples:

// Returns the matrix multiplication of a 3x2 matrix of clear integers and a 2x3 matrix of encrypted integers.
//         [ 1, 2, 3]
//         [ 2, 3, 4]
//       *
// [1,2]   [ 5, 8,11]
// [3,4] = [11,18,25]
// [5,6]   [17,28,39]
//
"FHELinalg.matmul_int_eint"(%a, %b) : (tensor<3x2xi7>, tensor<2x3x!FHE.eint<6>>) -> tensor<3x3x!FHE.eint<6>>

Traits: AlwaysSpeculatableImplTrait, TensorBinaryIntEint

Interfaces: Binary, BinaryIntEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.maxpool2d (::mlir::concretelang::FHELinalg::Maxpool2dOp)

Returns the 2D maxpool of a tensor in the form NCHW

Interfaces: UnaryEint

Attributes:

Attribute
MLIR Type
Description

kernel_shape

::mlir::DenseIntElementsAttr

64-bit signless integer elements attribute

strides

::mlir::DenseIntElementsAttr

64-bit signless integer elements attribute

dilations

::mlir::DenseIntElementsAttr

64-bit signless integer elements attribute

Operands:

Operand
Description

input

Results:

Result
Description

«unnamed»

FHELinalg.mul_eint_int (::mlir::concretelang::FHELinalg::MulEintIntOp)

Returns a tensor that contains the multiplication of a tensor of encrypted integers and a tensor of clear integers.

Performs a multiplication following the broadcasting rules between a tensor of encrypted integers and a tensor of clear integers. The width of the clear integers must be less than or equal to the width of encrypted integers.

Examples:

// Returns the term-by-term multiplication of `%a0` with `%a1`
"FHELinalg.mul_eint_int"(%a0, %a1) : (tensor<4x!FHE.eint<4>>, tensor<4xi5>) -> tensor<4x!FHE.eint<4>>

// Returns the term-by-term multiplication of `%a0` with `%a1`, where dimensions equal to one are stretched.
"FHELinalg.mul_eint_int"(%a0, %a1) : (tensor<4x1x4x!FHE.eint<4>>, tensor<1x4x4xi5>) -> tensor<4x4x4x!FHE.eint<4>>

// Returns the multiplication of a 3x3 matrix of encrypted integers and a 3x1 matrix (a column) of integers.
//
// [1,2,3]   [1]   [1,2,3]
// [4,5,6] * [2] = [8,10,12]
// [7,8,9]   [3]   [21,24,27]
//
// The dimension #1 of operand #2 is stretched as it is equal to 1.
"FHELinalg.mul_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3x1xi5>) -> tensor<3x3x!FHE.eint<4>>

// Returns the multiplication of a 3x3 matrix of encrypted integers and a 1x3 matrix (a line) of integers.
//
// [1,2,3]             [1,4,9]
// [4,5,6] * [1,2,3] = [4,10,18]
// [7,8,9]             [7,16,27]
//
// The dimension #2 of operand #2 is stretched as it is equal to 1.
"FHELinalg.mul_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<1x3xi5>) -> tensor<3x3x!FHE.eint<4>>

// Same behavior as the previous one, but with dimension #2 of operand #2 missing.
"FHELinalg.mul_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3xi5>) -> tensor<3x3x!FHE.eint<4>>

Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt, TensorBroadcastingRules

Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.mul_eint (::mlir::concretelang::FHELinalg::MulEintOp)

Returns a tensor that contains the multiplication of two tensors of encrypted integers.

Performs a multiplication following the broadcasting rules between two tensors of encrypted integers. The width of the encrypted integers must be equal.
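A minimal plaintext sketch of the element-wise product with broadcasting, limited to 2-D operands in which a dimension of size 1 is stretched. The function name is hypothetical; the values are those of the column and row cases used in the examples of this operation.

```python
def broadcast_mul(a, b):
    """Element-wise product of two 2-D tensors, stretching dimensions of size 1."""
    rows = max(len(a), len(b))
    cols = max(len(a[0]), len(b[0]))
    for t in (a, b):
        if len(t) not in (1, rows) or len(t[0]) not in (1, cols):
            raise ValueError("shapes are not broadcastable")

    def at(t, i, j):
        # A dimension of size 1 always reads its single element.
        return t[i if len(t) > 1 else 0][j if len(t[0]) > 1 else 0]

    return [[at(a, i, j) * at(b, i, j) for j in range(cols)] for i in range(rows)]


m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(broadcast_mul(m, [[1], [2], [3]]))  # [[1, 2, 3], [8, 10, 12], [21, 24, 27]]
print(broadcast_mul(m, [[1, 2, 3]]))      # [[1, 4, 9], [4, 10, 18], [7, 16, 27]]
```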

Examples:

// Returns the term-by-term multiplication of `%a0` with `%a1`
"FHELinalg.mul_eint"(%a0, %a1) : (tensor<4x!FHE.eint<8>>, tensor<4x!FHE.eint<8>>) -> tensor<4x!FHE.eint<8>>

// Returns the term-by-term multiplication of `%a0` with `%a1`, where dimensions equal to one are stretched.
"FHELinalg.mul_eint"(%a0, %a1) : (tensor<4x1x4x!FHE.eint<8>>, tensor<1x4x4x!FHE.eint<8>>) -> tensor<4x4x4x!FHE.eint<8>>

// Returns the multiplication of a 3x3 matrix of encrypted integers and a 3x1 matrix (a column) of encrypted integers.
//
// [1,2,3]   [1]   [1,2,3]
// [4,5,6] * [2] = [8,10,12]
// [7,8,9]   [3]   [21,24,27]
//
// The dimension #1 of operand #2 is stretched as it is equal to 1.
"FHELinalg.mul_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<8>>, tensor<3x1x!FHE.eint<8>>) -> tensor<3x3x!FHE.eint<8>>

// Returns the multiplication of a 3x3 matrix of encrypted integers and a 1x3 matrix (a line) of encrypted integers.
//
// [1,2,3]             [1,4,9]
// [4,5,6] * [1,2,3] = [4,10,18]
// [7,8,9]             [7,16,27]
//
// The dimension #2 of operand #2 is stretched as it is equal to 1.
"FHELinalg.mul_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<8>>, tensor<1x3x!FHE.eint<8>>) -> tensor<3x3x!FHE.eint<8>>

// Same behavior as the previous one, but with dimension #2 of operand #2 missing.
"FHELinalg.mul_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<8>>, tensor<3x!FHE.eint<8>>) -> tensor<3x3x!FHE.eint<8>>

Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint, TensorBroadcastingRules

Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.neg_eint (::mlir::concretelang::FHELinalg::NegEintOp)

Returns a tensor that contains the negation of a tensor of encrypted integers.

Performs a negation on a tensor of encrypted integers.

Examples:

// Returns the term-by-term negation of `%a0`
"FHELinalg.neg_eint"(%a0) : (tensor<3x3x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>
//
//        ( [1,2,3] )   [31,30,29]
// negate ( [4,5,6] ) = [28,27,26]
//        ( [7,8,9] )   [25,24,23]
//
// The negation is computed as `2**(p+1) - a` where p=4 here.
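The formula quoted above can be checked on clear values with a one-line sketch. The function name is hypothetical, and the formula `2**(p+1) - a` is applied literally, with no extra reduction.

```python
def negate(t, p):
    """Plaintext analogue: apply 2**(p+1) - a to every element of a 2-D tensor."""
    return [[2 ** (p + 1) - a for a in row] for row in t]


print(negate([[1, 2, 3], [4, 5, 6], [7, 8, 9]], p=4))
# [[31, 30, 29], [28, 27, 26], [25, 24, 23]]
```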

Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

«unnamed»

FHELinalg.reinterpret_precision (::mlir::concretelang::FHELinalg::ReinterpretPrecisionEintOp)

Reinterpret the ciphertext tensor with a different precision.

It's a reinterpretation cast which changes only the precision. On CRT representation, it does nothing. On Native representation, it moves the message/noise further forward, effectively changing the precision. Changing to:

  • a bigger precision is safe, as the crypto-parameters are chosen such that only zeros will come from the noise part. This is equivalent to a shift left for the value.

  • a smaller precision is only safe if you clear the lowest message bits first. If not, you can assume small errors with high probability and frequent bigger errors, which can be contained to small errors using margins. This is equivalent to a shift right for the value.

Example:

 // assuming a is encoded as 4 bits but can be stored in 2 bits,
 // we can reduce a to a smaller 2-bit precision
 %shifted_a = "FHELinalg.mul_eint_int"(%a, %c_4): (tensor<1x!FHE.eint<4>>) -> (tensor<1x!FHE.eint<4>>)
 %a_small_precision = "FHELinalg.reinterpret_precision"(%shifted_a) : (tensor<1x!FHE.eint<4>>) -> (tensor<1x!FHE.eint<2>>)
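The shift-left / shift-right description can be pictured on clear integers with the sketch below. It is a noise-free plaintext model under stated assumptions: the function name is hypothetical, and the effect of noise, padding, and crypto-parameters is not represented.

```python
def reinterpret_precision(value, from_bits, to_bits):
    """Noise-free plaintext model: a bigger precision shifts left, a smaller one shifts right."""
    if to_bits >= from_bits:
        return value << (to_bits - from_bits)
    return value >> (from_bits - to_bits)


a = 3              # fits in 2 bits, but is encoded with 4 bits of precision
shifted_a = a * 4  # multiply by 4 so that the two lowest bits are zero

print(reinterpret_precision(shifted_a, 4, 2))  # 3
print(reinterpret_precision(a, 2, 4))          # 12
```

Multiplying by 4 first is what makes the move to 2 bits lossless in this model, since the bits dropped by the right shift are already zero.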

Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

output

FHELinalg.round (::mlir::concretelang::FHELinalg::RoundOp)

Rounds a tensor of ciphertexts into a smaller precision.

  Assuming a ciphertext whose message is implemented over `p` bits, this
  operation rounds it to fit to `q` bits where `p>q`.
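The type constraint can be sketched as a small check. The function name and the `(signed, width)` pair representation are hypothetical. The requirement that signedness is preserved is an assumption inferred from the accepted and rejected type signatures listed for this operation, not a rule stated in the text.

```python
def is_valid_round(in_type, out_type):
    """Types are (signed, width) pairs, e.g. (False, 6) for !FHE.eint<6>."""
    in_signed, in_width = in_type
    out_signed, out_width = out_type
    # The output must be strictly narrower and keep the same signedness (assumed).
    return in_signed == out_signed and out_width < in_width


print(is_valid_round((False, 6), (False, 5)))  # True
print(is_valid_round((False, 6), (False, 6)))  # False
print(is_valid_round((False, 4), (True, 2)))   # False
```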

  Example:
  ```mlir
  // ok
  "FHELinalg.round"(%a): (tensor<3x!FHE.eint<6>>) -> (tensor<3x!FHE.eint<5>>)
  "FHELinalg.round"(%a): (tensor<3x!FHE.eint<5>>) -> (tensor<3x!FHE.eint<3>>)
  "FHELinalg.round"(%a): (tensor<3x!FHE.eint<3>>) -> (tensor<3x!FHE.eint<2>>)
  "FHELinalg.round"(%a): (tensor<3x!FHE.esint<3>>) -> (tensor<3x!FHE.esint<2>>)

  // error
  "FHELinalg.round"(%a): (tensor<3x!FHE.eint<6>>) -> (tensor<3x!FHE.eint<6>>)
  "FHELinalg.round"(%a): (tensor<3x!FHE.eint<4>>) -> (tensor<3x!FHE.eint<5>>)
  "FHELinalg.round"(%a): (tensor<3x!FHE.eint<4>>) -> (tensor<3x!FHE.esint<2>>)
  ```

Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

output

FHELinalg.sub_eint_int (::mlir::concretelang::FHELinalg::SubEintIntOp)

Returns a tensor that contains the subtraction of a tensor of clear integers from a tensor of encrypted integers.

Performs a subtraction, following the broadcasting rules, of a tensor of clear integers from a tensor of encrypted integers.
The width of the clear integers must be less than or equal to the width of encrypted integers.

Examples:
```mlir
// Returns the term-by-term subtraction of `%a0` with `%a1`
"FHELinalg.sub_eint_int"(%a0, %a1) : (tensor<4x!FHE.eint<4>>, tensor<4xi5>) -> tensor<4x!FHE.eint<4>>

// Returns the term-by-term subtraction of `%a0` with `%a1`, where dimensions equal to one are stretched.
"FHELinalg.sub_eint_int"(%a0, %a1) : (tensor<1x4x4x!FHE.eint<4>>, tensor<4x1x4xi5>) -> tensor<4x4x4x!FHE.eint<4>>

// Returns the subtraction of a 3x1 matrix (a column) of integers from a 3x3 matrix of encrypted integers.
//
// [1,2,3]   [1]   [0,1,2]
// [4,5,6] - [2] = [2,3,4]
// [7,8,9]   [3]   [4,5,6]
//
// The dimension #1 of operand #2 is stretched as it is equal to 1.
"FHELinalg.sub_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3x1xi5>) -> tensor<3x3x!FHE.eint<4>>

// Returns the subtraction of a 1x3 matrix (a line) of integers from a 3x3 matrix of encrypted integers.
//
// [1,2,3]             [0,0,0]
// [4,5,6] - [1,2,3] = [3,3,3]
// [7,8,9]             [6,6,6]
//
// The dimension #2 of operand #2 is stretched as it is equal to 1.
"FHELinalg.sub_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<1x3xi5>) -> tensor<3x3x!FHE.eint<4>>

// Same behavior as the previous one, but with dimension #2 of operand #2 missing.
"FHELinalg.sub_eint_int"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3xi5>) -> tensor<3x3x!FHE.eint<4>>
```

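The shape rule used in these examples can be sketched in plain Python. This is a minimal sketch, assuming the broadcasting rules are the usual NumPy-style ones (shapes are aligned on their last dimension, a missing dimension counts as 1, and a dimension of 1 is stretched). The helper name `broadcast_shape` is illustrative and not part of Concrete.

```python
def broadcast_shape(lhs, rhs):
    """Return the broadcast result shape of two tensor shapes, or raise ValueError."""
    rank = max(len(lhs), len(rhs))
    # Right-align both shapes, padding missing leading dimensions with 1.
    lhs = (1,) * (rank - len(lhs)) + tuple(lhs)
    rhs = (1,) * (rank - len(rhs)) + tuple(rhs)
    result = []
    for a, b in zip(lhs, rhs):
        # Two dimensions are compatible if they are equal or one of them is 1.
        if a != b and a != 1 and b != 1:
            raise ValueError(f"incompatible dimensions: {a} and {b}")
        result.append(max(a, b))
    return tuple(result)


print(broadcast_shape((1, 4, 4), (4, 1, 4)))  # (4, 4, 4)
print(broadcast_shape((3, 3), (3, 1)))        # (3, 3)
print(broadcast_shape((3, 3), (3,)))          # (3, 3)
```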
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt, TensorBroadcastingRules

Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.sub_eint (::mlir::concretelang::FHELinalg::SubEintOp)

Returns a tensor that contains the subtraction of two tensors of encrypted integers.

Performs a subtraction following the broadcasting rules between two tensors of encrypted integers. The width of the encrypted integers must be equal.

Examples:

// Returns the term-by-term subtraction of `%a0` with `%a1`
"FHELinalg.sub_eint"(%a0, %a1) : (tensor<4x!FHE.eint<4>>, tensor<4x!FHE.eint<4>>) -> tensor<4x!FHE.eint<4>>

// Returns the term-by-term subtraction of `%a0` with `%a1`, where dimensions equal to one are stretched.
"FHELinalg.sub_eint"(%a0, %a1) : (tensor<4x1x4x!FHE.eint<4>>, tensor<1x4x4x!FHE.eint<4>>) -> tensor<4x4x4x!FHE.eint<4>>

// Returns the subtraction of a 3x1 matrix (a column) of encrypted integers from a 3x3 matrix of encrypted integers.
//
// [1,2,3]   [1]   [0,1,2]
// [4,5,6] - [2] = [2,3,4]
// [7,8,9]   [3]   [4,5,6]
//
// The dimension #1 of operand #2 is stretched as it is equal to 1.
"FHELinalg.sub_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3x1x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

// Returns the subtraction of a 1x3 matrix (a line) of encrypted integers from a 3x3 matrix of encrypted integers.
//
// [1,2,3]             [0,0,0]
// [4,5,6] - [1,2,3] = [3,3,3]
// [7,8,9]             [6,6,6]
//
// The dimension #2 of operand #2 is stretched as it is equal to 1.
"FHELinalg.sub_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<1x3x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

// Same behavior as the previous one, but dimension #2 of operand #2 is missing.
"FHELinalg.sub_eint"(%a0, %a1) : (tensor<3x3x!FHE.eint<4>>, tensor<3x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

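To check the matrix examples by hand, the broadcasted subtraction can be reproduced on clear values. This is a minimal plain-Python sketch restricted to a 2-D left operand; `broadcast_sub` is a hypothetical helper, not a Concrete function, and it says nothing about how the encrypted computation is carried out.

```python
def broadcast_sub(lhs, rhs):
    """Compute lhs - rhs for a 2-D lhs and a column, row, or 1-D rhs."""
    # Treat a 1-D rhs as a single row (its missing leading dimension is 1).
    if rhs and not isinstance(rhs[0], list):
        rhs = [rhs]
    rows, cols = len(lhs), len(lhs[0])
    out = []
    for i in range(rows):
        # A dimension of size 1 is stretched by always reading index 0.
        rhs_row = rhs[i if len(rhs) > 1 else 0]
        out.append(
            [lhs[i][j] - rhs_row[j if len(rhs_row) > 1 else 0] for j in range(cols)]
        )
    return out


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(broadcast_sub(matrix, [[1], [2], [3]]))  # [[0, 1, 2], [2, 3, 4], [4, 5, 6]]
print(broadcast_sub(matrix, [[1, 2, 3]]))      # [[0, 0, 0], [3, 3, 3], [6, 6, 6]]
print(broadcast_sub(matrix, [1, 2, 3]))        # [[0, 0, 0], [3, 3, 3], [6, 6, 6]]
```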
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint, TensorBroadcastingRules

Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.sub_int_eint (::mlir::concretelang::FHELinalg::SubIntEintOp)

Returns a tensor that contains the subtraction of a tensor of encrypted integers from a tensor of clear integers.

Subtracts a tensor of encrypted integers from a tensor of clear integers, following the broadcasting rules. The clear integers may be at most one bit wider than the encrypted integers.

Examples:

// Returns the term-by-term subtraction of `%a0` with `%a1`
"FHELinalg.sub_int_eint"(%a0, %a1) : (tensor<4xi5>, tensor<4x!FHE.eint<4>>) -> tensor<4x!FHE.eint<4>>

// Returns the term-by-term subtraction of `%a0` with `%a1`, where dimensions equal to one are stretched.
"FHELinalg.sub_int_eint"(%a0, %a1) : (tensor<4x1x4xi5>, tensor<1x4x4x!FHE.eint<4>>) -> tensor<4x4x4x!FHE.eint<4>>

// Returns the subtraction of a 3x3 matrix of integers and a 3x1 matrix (a column) of encrypted integers.
//
// [1,2,3]   [1]   [0,1,2]
// [4,5,6] - [2] = [2,3,4]
// [7,8,9]   [3]   [4,5,6]
//
// The dimension #1 of operand #2 is stretched as it is equal to 1.
"FHELinalg.sub_int_eint"(%a0, %a1) : (tensor<3x3xi5>, tensor<3x1x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

// Returns the subtraction of a 3x3 matrix of integers and a 1x3 matrix (a line) of encrypted integers.
//
// [1,2,3]             [0,0,0]
// [4,5,6] - [1,2,3] = [3,3,3]
// [7,8,9]             [6,6,6]
//
// The dimension #2 of operand #2 is stretched as it is equal to 1.
"FHELinalg.sub_int_eint"(%a0, %a1) : (tensor<3x3xi5>, tensor<1x3x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

// Same behavior as the previous one, but dimension #2 of operand #2 is missing.
"FHELinalg.sub_int_eint"(%a0, %a1) : (tensor<3x3xi5>, tensor<3x!FHE.eint<4>>) -> tensor<3x3x!FHE.eint<4>>

Traits: AlwaysSpeculatableImplTrait, TensorBinaryIntEint, TensorBroadcastingRules

Interfaces: Binary, BinaryIntEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

rhs

Results:

Result
Description

«unnamed»

FHELinalg.sum (::mlir::concretelang::FHELinalg::SumOp)

Returns the sum of elements of a tensor of encrypted integers along specified axes.

Attributes:

  • keep_dims: boolean = false. Whether to keep the rank of the tensor after the sum operation. If true, reduced axes will have a size of 1.

  • axes: I64ArrayAttr = []. List of dimensions to perform the sum along. Think of it as the dimensions to reduce (see the examples below to get an intuition).

Examples:

// Returns the sum of all elements of `%a0`
"FHELinalg.sum"(%a0) : (tensor<3x3x!FHE.eint<4>>) -> !FHE.eint<4>
//
//     ( [1,2,3] )
// sum ( [4,5,6] ) = 45
//     ( [7,8,9] )
//
// Returns the sum of all elements of `%a0` along columns
"FHELinalg.sum"(%a0) { axes = [0] } : (tensor<3x2x!FHE.eint<4>>) -> tensor<2x!FHE.eint<4>>
//
//     ( [1,2] )
// sum ( [3,4] ) = [9, 12]
//     ( [5,6] )
//
// Returns the sum of all elements of `%a0` along columns while preserving dimensions
"FHELinalg.sum"(%a0) { axes = [0], keep_dims = true } : (tensor<3x2x!FHE.eint<4>>) -> tensor<1x2x!FHE.eint<4>>
//
//     ( [1,2] )
// sum ( [3,4] ) = [[9, 12]]
//     ( [5,6] )
//
// Returns the sum of all elements of `%a0` along rows
"FHELinalg.sum"(%a0) { axes = [1] } : (tensor<3x2x!FHE.eint<4>>) -> tensor<3x!FHE.eint<4>>
//
//     ( [1,2] )
// sum ( [3,4] ) = [3, 7, 11]
//     ( [5,6] )
//
// Returns the sum of all elements of `%a0` along rows while preserving dimensions
"FHELinalg.sum"(%a0) { axes = [1], keep_dims = true } : (tensor<3x2x!FHE.eint<4>>) -> tensor<3x1x!FHE.eint<4>>
//
//     ( [1,2] )   [3]
// sum ( [3,4] ) = [7]
//     ( [5,6] )   [11]
//
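
The effect of `axes` and `keep_dims` can be reproduced on clear values. This is a minimal plain-Python sketch restricted to 2-D nested lists; `tensor_sum` is a hypothetical helper, and it assumes that an empty `axes` list reduces over every dimension, as the first example suggests.

```python
def tensor_sum(matrix, axes=(), keep_dims=False):
    """Sum a 2-D nested list along the given axes (empty axes reduces everything)."""
    axes = set(axes) or {0, 1}
    rows, cols = len(matrix), len(matrix[0])
    if axes == {0, 1}:
        total = sum(sum(row) for row in matrix)
        return [[total]] if keep_dims else total
    if axes == {0}:
        # Reducing axis 0 collapses the rows: one sum per column.
        col_sums = [sum(matrix[i][j] for i in range(rows)) for j in range(cols)]
        return [col_sums] if keep_dims else col_sums
    if axes == {1}:
        # Reducing axis 1 collapses the columns: one sum per row.
        row_sums = [sum(row) for row in matrix]
        return [[s] for s in row_sums] if keep_dims else row_sums
    raise ValueError("axes must be a subset of {0, 1} for a 2-D tensor")


data = [[1, 2], [3, 4], [5, 6]]
print(tensor_sum([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 45
print(tensor_sum(data, axes=[0]))                      # [9, 12]
print(tensor_sum(data, axes=[0], keep_dims=True))      # [[9, 12]]
print(tensor_sum(data, axes=[1]))                      # [3, 7, 11]
print(tensor_sum(data, axes=[1], keep_dims=True))      # [[3], [7], [11]]
```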

Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `axes` | ::mlir::ArrayAttr | 64-bit integer array attribute |
| `keep_dims` | ::mlir::BoolAttr | bool attribute |

Operands:

Operand
Description

tensor

Results:

Result
Description

out

FHELinalg.to_signed (::mlir::concretelang::FHELinalg::ToSignedOp)

Cast an unsigned integer tensor to a signed one

Cast an unsigned integer tensor to a signed one. The result must have the same width and the same shape as the input.

The behavior is undefined on overflow/underflow.

Examples:

// ok
"FHELinalg.to_signed"(%x) : (tensor<3x2x!FHE.eint<2>>) -> tensor<3x2x!FHE.esint<2>>

// error
"FHELinalg.to_signed"(%x) : (tensor<3x2x!FHE.eint<2>>) -> tensor<3x2x!FHE.esint<3>>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

output

FHELinalg.to_unsigned (::mlir::concretelang::FHELinalg::ToUnsignedOp)

Cast a signed integer tensor to an unsigned one

Cast a signed integer tensor to an unsigned one. The result must have the same width and the same shape as the input.

The behavior is undefined on overflow/underflow.

Examples:

// ok
"FHELinalg.to_unsigned"(%x) : (tensor<3x2x!FHE.esint<2>>) -> tensor<3x2x!FHE.eint<2>>

// error
"FHELinalg.to_unsigned"(%x) : (tensor<3x2x!FHE.esint<2>>) -> tensor<3x2x!FHE.eint<3>>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

output

FHELinalg.transpose (::mlir::concretelang::FHELinalg::TransposeOp)

Returns a tensor that contains the transposition of the input tensor.

Performs a transpose operation on an N-dimensional tensor.

Attributes:

  • axes: I64ArrayAttr = []. List of dimensions defining the transposition. It contains a permutation of [0,1,..,N-1], where N is the number of axes. Think of it as a way to rearrange axes (see the example below).

"FHELinalg.transpose"(%a) : (tensor<n0xn1x...xnNxType>) -> tensor<nNx...xn1xn0xType>

Examples:

// Transpose the input tensor
// [1,2]    [1, 3, 5]
// [3,4] => [2, 4, 6]
// [5,6]
//
"FHELinalg.transpose"(%a) : (tensor<3x2xi7>) -> tensor<2x3xi7>
"FHELinalg.transpose"(%a) { axes = [1, 3, 0, 2] } : (tensor<2x3x4x5xi7>) -> tensor<3x5x2x4xi7>

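The result shapes in these examples follow from the `axes` permutation. This is a minimal plain-Python sketch, assuming that result dimension `i` takes its size from input dimension `axes[i]` and that an empty `axes` list reverses the dimension order (both are consistent with the examples above). `transposed_shape` is a hypothetical helper.

```python
def transposed_shape(shape, axes=()):
    """Return the shape produced by transposing a tensor of the given shape."""
    rank = len(shape)
    # An empty `axes` list reverses the dimension order.
    axes = list(axes) or list(range(rank - 1, -1, -1))
    if sorted(axes) != list(range(rank)):
        raise ValueError("axes must be a permutation of [0, ..., N-1]")
    # Result dimension i takes its size from input dimension axes[i].
    return tuple(shape[axis] for axis in axes)


print(transposed_shape((3, 2)))                           # (2, 3)
print(transposed_shape((2, 3, 4, 5), axes=[1, 3, 0, 2]))  # (3, 5, 2, 4)

# The 2-D case on clear values: rows become columns.
matrix = [[1, 2], [3, 4], [5, 6]]
print([list(column) for column in zip(*matrix)])          # [[1, 3, 5], [2, 4, 6]]
```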
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

axes

::mlir::ArrayAttr

64-bit integer array attribute

Operands:

Operand
Description

tensor

any type

Results:

Result
Description

«unnamed»

any type

FHE dialect

High Level Fully Homomorphic Encryption dialect. A dialect for the representation of high-level operations on fully homomorphic ciphertexts.

Operation definition

FHE.add_eint_int (::mlir::concretelang::FHE::AddEintIntOp)

Adds an encrypted integer and a clear integer

The clear integer must have at most one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.

Example:

// ok
"FHE.add_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.eint<2>
"FHE.add_eint_int"(%a, %i) : (!FHE.esint<2>, i3) -> !FHE.esint<2>

// error
"FHE.add_eint_int"(%a, %i) : (!FHE.eint<2>, i4) -> !FHE.eint<2>
"FHE.add_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.eint<3>
"FHE.add_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.esint<2>

Traits: AlwaysSpeculatableImplTrait

Interfaces: AdditiveNoise, Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| :-----: | ----------- |
| `a` | |
| `b` | integer |

Results:

Result
Description

«unnamed»

FHE.add_eint (::mlir::concretelang::FHE::AddEintOp)

Adds two encrypted integers

The encrypted integers and the result must have the same width and the same signedness.

Example:

// ok
"FHE.add_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.eint<2>)
"FHE.add_eint"(%a, %b): (!FHE.esint<2>, !FHE.esint<2>) -> (!FHE.esint<2>)

// error
"FHE.add_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<3>) -> (!FHE.eint<2>)
"FHE.add_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.eint<3>)
"FHE.add_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.esint<2>)
"FHE.add_eint"(%a, %b): (!FHE.esint<2>, !FHE.eint<2>) -> (!FHE.eint<2>)

Traits: AlwaysSpeculatableImplTrait

Interfaces: AdditiveNoise, BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

b

Results:

Result
Description

«unnamed»

FHE.apply_lookup_table (::mlir::concretelang::FHE::ApplyLookupTableEintOp)

Applies a clear lookup table to an encrypted integer

The width of the result can be different than the width of the operand. The lookup table must be a tensor of size 2^p where p is the width of the encrypted integer.

Example:

// ok
"FHE.apply_lookup_table"(%a, %lut): (!FHE.eint<2>, tensor<4xi64>) -> (!FHE.eint<2>)
"FHE.apply_lookup_table"(%a, %lut): (!FHE.eint<2>, tensor<4xi64>) -> (!FHE.eint<3>)
"FHE.apply_lookup_table"(%a, %lut): (!FHE.eint<3>, tensor<8xi64>) -> (!FHE.eint<2>)

// error
"FHE.apply_lookup_table"(%a, %lut): (!FHE.eint<2>, tensor<8xi64>) -> (!FHE.eint<2>)

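On clear values, a table lookup amounts to indexing a list of 2^p entries with the p-bit input. This is a minimal plain-Python sketch for unsigned inputs only; `apply_lookup_table` here is a hypothetical helper that mirrors the size rule, not the homomorphic implementation.

```python
def apply_lookup_table(value, table, width):
    """Look up a `width`-bit unsigned value in a table of 2**width entries."""
    if len(table) != 2 ** width:
        raise ValueError(f"expected {2 ** width} table entries, got {len(table)}")
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} unsigned bits")
    return table[value]


# A 2-bit input needs a table of 4 entries.
square = [x * x for x in range(4)]
print(apply_lookup_table(3, square, width=2))  # 9
```

The result here (9) needs 4 bits while the input has 2, which is why the result width is allowed to differ from the operand width.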
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| :-----: | ----------- |
| `a` | |
| `lut` | tensor of integer values |

Results:

Result
Description

«unnamed»

FHE.and (::mlir::concretelang::FHE::BoolAndOp)

Applies an AND gate to two encrypted boolean values

Example:

"FHE.and"(%a, %b): (!FHE.ebool, !FHE.ebool) -> (!FHE.ebool)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

left

An encrypted boolean

right

An encrypted boolean

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.nand (::mlir::concretelang::FHE::BoolNandOp)

Applies a NAND gate to two encrypted boolean values

Example:

"FHE.nand"(%a, %b): (!FHE.ebool, !FHE.ebool) -> (!FHE.ebool)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

left

An encrypted boolean

right

An encrypted boolean

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.not (::mlir::concretelang::FHE::BoolNotOp)

Applies a NOT gate to an encrypted boolean value

Example:

"FHE.not"(%a): (!FHE.ebool) -> (!FHE.ebool)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

value

An encrypted boolean

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.or (::mlir::concretelang::FHE::BoolOrOp)

Applies an OR gate to two encrypted boolean values

Example:

"FHE.or"(%a, %b): (!FHE.ebool, !FHE.ebool) -> (!FHE.ebool)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

left

An encrypted boolean

right

An encrypted boolean

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.xor (::mlir::concretelang::FHE::BoolXorOp)

Applies an XOR gate to two encrypted boolean values

Example:

"FHE.xor"(%a, %b): (!FHE.ebool, !FHE.ebool) -> (!FHE.ebool)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

left

An encrypted boolean

right

An encrypted boolean

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.change_partition (::mlir::concretelang::FHE::ChangePartitionEintOp)

Change partition if necessary.

Changes the partition of a ciphertext. If necessary, it keyswitches the ciphertext to a different key whose set of parameters differs from the original one.

Example:

  %from_src = "FHE.change_partition"(%eint) {src = #FHE.partition<name "tfhers", lwe_dim 761, glwe_dim 1, poly_size 2048, pbs_base_log 23, pbs_level 1>} : (!FHE.eint<16>) -> (!FHE.eint<16>)
  %to_dest = "FHE.change_partition"(%eint) {dest = #FHE.partition<name "tfhers", lwe_dim 761, glwe_dim 1, poly_size 2048, pbs_base_log 23, pbs_level 1>} : (!FHE.eint<16>) -> (!FHE.eint<16>)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `src` | ::mlir::concretelang::FHE::PartitionAttr | An attribute representing a partition. |
| `dest` | ::mlir::concretelang::FHE::PartitionAttr | An attribute representing a partition. |

Operands:

Operand
Description

input

Results:

Result
Description

«unnamed»

FHE.from_bool (::mlir::concretelang::FHE::FromBoolOp)

Cast a boolean to an unsigned integer

Examples:

"FHE.from_bool"(%x) : (!FHE.ebool) -> !FHE.eint<1>
"FHE.from_bool"(%x) : (!FHE.ebool) -> !FHE.eint<2>
"FHE.from_bool"(%x) : (!FHE.ebool) -> !FHE.eint<4>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

An encrypted boolean

Results:

Result
Description

«unnamed»

An encrypted unsigned integer

FHE.gen_gate (::mlir::concretelang::FHE::GenGateOp)

Applies a truth table based on two boolean inputs

Truth table must be a tensor of four boolean values.

Example:

// ok
"FHE.gen_gate"(%a, %b, %ttable): (!FHE.ebool, !FHE.ebool, tensor<4xi64>) -> (!FHE.ebool)

// error
"FHE.gen_gate"(%a, %b, %ttable): (!FHE.ebool, !FHE.ebool, tensor<7xi64>) -> (!FHE.ebool)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

left

An encrypted boolean

right

An encrypted boolean

truth_table

tensor of integer values

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.lsb (::mlir::concretelang::FHE::LsbEintOp)

Extract the least significant bit at a given precision.

This operation extracts the lsb of a ciphertext in a specific precision.

Extracting the lsb with the smallest precision:

 // Checking if even or odd
 %even = "FHE.lsb"(%a): (!FHE.eint<4>) -> (!FHE.eint<1>)

Usually, when you extract the lsb, you also need to extract the next bit.
In that case, you first need to clear the lsb of the input so that you can reduce its precision and extract the next one.
To be able to clear the lsb you just extracted, you can get it in the original precision.

Example:
```mlir
 // Extracting the first lsb with original precision
 %lsb_0 = "FHE.lsb"(%input): (!FHE.eint<4>) -> (!FHE.eint<4>)
 // Clearing the first lsb from original input
 %input_lsb0_cleared = "FHE.sub_eint"(%input, %lsb_0) : (!FHE.eint<4>, !FHE.eint<4>) -> (!FHE.eint<4>)
 // Reducing the precision of the input (its lsb is now cleared)
 %input_3b = "FHE.reinterpret_precision"(%input_lsb0_cleared) : (!FHE.eint<4>) -> !FHE.eint<3>
 // Now, we can do it again with the second lsb
 %lsb_1 = "FHE.lsb"(%input_3b): (!FHE.eint<3>) -> (!FHE.eint<3>)
 ...
 // later if you need %lsb_1 at the same position as in the input
 %lsb_1_at_input_position = "FHE.reinterpret_precision"(%lsb_1) : (!FHE.eint<3>) -> !FHE.eint<4>
 // that way you can recombine the extracted bits
 %input_mod_4 = "FHE.add_eint"(%lsb_0, %lsb_1_at_input_position) : (!FHE.eint<4>, !FHE.eint<4>) -> (!FHE.eint<4>)
```

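The same sequence can be followed on a clear 4-bit integer. This is a minimal plain-Python sketch: it models a precision reduction as a right shift and a precision increase as a left shift, as described for `FHE.reinterpret_precision`, and the function name `extract_two_lsbs` is illustrative only.

```python
def extract_two_lsbs(value):
    """Mirror the MLIR steps above on a clear 4-bit unsigned integer."""
    lsb_0 = value & 1                      # FHE.lsb at the original precision
    cleared = value - lsb_0                # FHE.sub_eint: clear the extracted bit
    value_3b = cleared >> 1                # reinterpret 4 -> 3 bits (shift right)
    lsb_1 = value_3b & 1                   # FHE.lsb on the 3-bit value
    lsb_1_at_input_position = lsb_1 << 1   # reinterpret 3 -> 4 bits (shift left)
    return lsb_0 + lsb_1_at_input_position # FHE.add_eint: recombine the bits


# The recombined bits equal the input modulo 4 for every 4-bit value.
for x in range(16):
    assert extract_two_lsbs(x) == x % 4

print(extract_two_lsbs(7))   # 3
print(extract_two_lsbs(10))  # 2
```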
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

«unnamed»

FHE.max_eint (::mlir::concretelang::FHE::MaxEintOp)

Retrieve the maximum of two encrypted integers.

Retrieve the maximum of two encrypted integers using the formula, 'max(x, y) == max(x - y, 0) + y'. The input and output types should be the same.

If `x - y` inside the max overflows or underflows, the behavior is undefined. To support the full range, you should increase the bit-width by 1 manually.
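
The identity and the range issue can be checked on clear values. This is a minimal plain-Python sketch for signed integers; `max_via_formula` and `fits_signed` are hypothetical helpers, and the sketch only shows when `x - y` leaves the representable range, not what the encrypted computation does in that case.

```python
def max_via_formula(x, y):
    """Compute max(x, y) as max(x - y, 0) + y."""
    return max(x - y, 0) + y


def fits_signed(value, width):
    """Check whether value is representable as a signed integer of `width` bits."""
    return -(1 << (width - 1)) <= value <= (1 << (width - 1)) - 1


# The identity holds for every pair of 3-bit signed values.
for x in range(-4, 4):
    for y in range(-4, 4):
        assert max_via_formula(x, y) == max(x, y)

# The intermediate x - y may need one more bit than the inputs.
x, y = 3, -4
print(fits_signed(x - y, 3))  # False: 7 is outside [-4, 3]
print(fits_signed(x - y, 4))  # True: 7 is inside [-8, 7]
```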

Example:

// ok
"FHE.max_eint"(%x, %y) : (!FHE.eint<2>, !FHE.eint<2>) -> !FHE.eint<2>
"FHE.max_eint"(%x, %y) : (!FHE.esint<3>, !FHE.esint<3>) -> !FHE.esint<3>

// error
"FHE.max_eint"(%x, %y) : (!FHE.eint<2>, !FHE.eint<3>) -> !FHE.eint<2>
"FHE.max_eint"(%x, %y) : (!FHE.eint<2>, !FHE.eint<2>) -> !FHE.esint<2>
"FHE.max_eint"(%x, %y) : (!FHE.esint<2>, !FHE.eint<2>) -> !FHE.eint<2>

Traits: AlwaysSpeculatableImplTrait

Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

x

y

Results:

Result
Description

«unnamed»

FHE.mul_eint_int (::mlir::concretelang::FHE::MulEintIntOp)

Multiply an encrypted integer with a clear integer

The clear integer must have one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.

Example:

// ok
"FHE.mul_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.eint<2>
"FHE.mul_eint_int"(%a, %i) : (!FHE.esint<2>, i3) -> !FHE.esint<2>

// error
"FHE.mul_eint_int"(%a, %i) : (!FHE.eint<2>, i4) -> !FHE.eint<2>
"FHE.mul_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.eint<3>
"FHE.mul_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.esint<2>

Traits: AlwaysSpeculatableImplTrait

Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

b

integer

Results:

Result
Description

«unnamed»

FHE.mul_eint (::mlir::concretelang::FHE::MulEintOp)

Multiplies two encrypted integers

The encrypted integers and the result must have the same width and signedness. Also, due to the current implementation, one supplementary bit of width must be provided, in addition to the number of bits needed to encode the largest output value.

Example:

// ok
"FHE.mul_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.eint<2>)
"FHE.mul_eint"(%a, %b): (!FHE.eint<3>, !FHE.eint<3>) -> (!FHE.eint<3>)
"FHE.mul_eint"(%a, %b): (!FHE.esint<3>, !FHE.esint<3>) -> (!FHE.esint<3>)

// error
"FHE.mul_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<3>) -> (!FHE.eint<2>)
"FHE.mul_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.eint<3>)
"FHE.mul_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.esint<2>)
"FHE.mul_eint"(%a, %b): (!FHE.esint<2>, !FHE.eint<2>) -> (!FHE.eint<2>)

Traits: AlwaysSpeculatableImplTrait

Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

rhs

lhs

Results:

Result
Description

«unnamed»

FHE.mux (::mlir::concretelang::FHE::MuxOp)

Multiplexer for two encrypted boolean inputs, based on an encrypted condition

Example:

"FHE.mux"(%cond, %c1, %c2): (!FHE.ebool, !FHE.ebool, !FHE.ebool) -> (!FHE.ebool)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

cond

An encrypted boolean

c1

An encrypted boolean

c2

An encrypted boolean

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.neg_eint (::mlir::concretelang::FHE::NegEintOp)

Negates an encrypted integer

The result must have the same width and the same signedness as the encrypted integer.

Example:

// ok
"FHE.neg_eint"(%a): (!FHE.eint<2>) -> (!FHE.eint<2>)
"FHE.neg_eint"(%a): (!FHE.esint<2>) -> (!FHE.esint<2>)

// error
"FHE.neg_eint"(%a): (!FHE.eint<2>) -> (!FHE.eint<3>)
"FHE.neg_eint"(%a): (!FHE.eint<2>) -> (!FHE.esint<2>)

Traits: AlwaysSpeculatableImplTrait

Interfaces: AdditiveNoise, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

Results:

Result
Description

«unnamed»

FHE.reinterpret_precision (::mlir::concretelang::FHE::ReinterpretPrecisionEintOp)

Reinterpret the ciphertext with a different precision.

Changes the precision of a ciphertext. This changes the precision, the value, and in certain cases the correctness of the ciphertext.

Changing to:

  • a bigger precision is always safe. This is equivalent to a shift left for the value.

  • a smaller precision is only safe if you clear the lowest bits that are discarded. If not, you can assume small errors on the next TLU. This is equivalent to a shift right for the value.

Example:

 // assuming %a is stored as 4bits but can be stored with only 2bits
 // we can reduce its storage precision
 %shifted_a = "FHE.mul_eint_int"(%a, %c_4): (!FHE.eint<4>, i5) -> (!FHE.eint<4>)
 %a_small_precision = "FHE.reinterpret_precision"(%shifted_a) : (!FHE.eint<4>) -> (!FHE.eint<2>)
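
On clear values, the shift behavior described above can be sketched as follows. This is a minimal plain-Python model that only tracks the value; `reinterpret_precision` here is a hypothetical helper and does not model the noise or correctness effects on a real ciphertext.

```python
def reinterpret_precision(value, from_bits, to_bits):
    """Model the value change: shift left when growing, shift right when shrinking."""
    if to_bits >= from_bits:
        return value << (to_bits - from_bits)
    return value >> (from_bits - to_bits)


# %a is stored on 4 bits but only uses 2 bits (values 0..3).
a = 3
shifted_a = a * 4                              # move the payload to the top 2 bits
print(reinterpret_precision(shifted_a, 4, 2))  # 3: the two cleared low bits are dropped

# Growing the precision shifts the value left.
print(reinterpret_precision(1, 3, 4))          # 2
```

If the discarded low bits were not zero (for example 13 instead of 12), this clear model would still print 3, but on a ciphertext the text above warns that small errors can appear on the next TLU.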

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

«unnamed»

FHE.round (::mlir::concretelang::FHE::RoundEintOp)

Rounds a ciphertext to a smaller precision.

Assuming a ciphertext whose message is implemented over p bits, this operation rounds it to fit to q bits with p>q.

Example:

 // ok
 "FHE.round"(%a): (!FHE.eint<6>) -> (!FHE.eint<5>)
 "FHE.round"(%a): (!FHE.eint<5>) -> (!FHE.eint<3>)
 "FHE.round"(%a): (!FHE.eint<3>) -> (!FHE.eint<2>)
 "FHE.round"(%a): (!FHE.esint<3>) -> (!FHE.esint<2>)

// error
 "FHE.round"(%a): (!FHE.eint<6>) -> (!FHE.eint<6>)
 "FHE.round"(%a): (!FHE.eint<4>) -> (!FHE.eint<5>)
 "FHE.round"(%a): (!FHE.eint<4>) -> (!FHE.esint<5>)

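On clear values, rounding from p bits to q bits can be pictured as dropping the p - q low bits after adding half of the discarded range. This is a minimal plain-Python sketch that assumes round-half-up behavior, which the text above does not state; `round_to_precision` is a hypothetical helper.

```python
def round_to_precision(value, p, q):
    """Round a p-bit unsigned value to q bits (assumes round half up)."""
    if not p > q:
        raise ValueError("the target precision q must be smaller than p")
    shift = p - q
    # Add half of the discarded range, then drop the low bits.
    return (value + (1 << (shift - 1))) >> shift


print(round_to_precision(13, p=5, q=3))  # 3  (13 / 4 = 3.25)
print(round_to_precision(14, p=5, q=3))  # 4  (14 / 4 = 3.5)
```

Values near the top of the p-bit range can round up to 2^q, which no longer fits in q bits; how that case is handled is outside the scope of this sketch.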
Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

Results:

Result
Description

«unnamed»

FHE.sub_eint_int (::mlir::concretelang::FHE::SubEintIntOp)

Subtract a clear integer from an encrypted integer

The clear integer must have one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.

Example:

// ok
"FHE.sub_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.eint<2>
"FHE.sub_eint_int"(%a, %i) : (!FHE.esint<2>, i3) -> !FHE.esint<2>

// error
"FHE.sub_eint_int"(%a, %i) : (!FHE.eint<2>, i4) -> !FHE.eint<2>
"FHE.sub_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.eint<3>
"FHE.sub_eint_int"(%a, %i) : (!FHE.eint<2>, i3) -> !FHE.esint<2>

Traits: AlwaysSpeculatableImplTrait

Interfaces: AdditiveNoise, Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| :-----: | ----------- |
| `a` | |
| `b` | integer |

Results:

Result
Description

«unnamed»

FHE.sub_eint (::mlir::concretelang::FHE::SubEintOp)

Subtract an encrypted integer from an encrypted integer

The encrypted integers and the result must have the same width and the same signedness.

Example:

// ok
"FHE.sub_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.eint<2>)
"FHE.sub_eint"(%a, %b): (!FHE.esint<2>, !FHE.esint<2>) -> (!FHE.esint<2>)

// error
"FHE.sub_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<3>) -> (!FHE.eint<2>)
"FHE.sub_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.eint<3>)
"FHE.sub_eint"(%a, %b): (!FHE.eint<2>, !FHE.esint<2>) -> (!FHE.esint<2>)
"FHE.sub_eint"(%a, %b): (!FHE.eint<2>, !FHE.eint<2>) -> (!FHE.esint<2>)

Traits: AlwaysSpeculatableImplTrait

Interfaces: AdditiveNoise, BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

a

b

Results:

Result
Description

«unnamed»

FHE.sub_int_eint (::mlir::concretelang::FHE::SubIntEintOp)

Subtract an encrypted integer from a clear integer

The clear integer must have one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.

Example:

// ok
"FHE.sub_int_eint"(%i, %a) : (i3, !FHE.eint<2>) -> !FHE.eint<2>
"FHE.sub_int_eint"(%i, %a) : (i3, !FHE.esint<2>) -> !FHE.esint<2>

// error
"FHE.sub_int_eint"(%i, %a) : (i4, !FHE.eint<2>) -> !FHE.eint<2>
"FHE.sub_int_eint"(%i, %a) : (i3, !FHE.eint<2>) -> !FHE.eint<3>
"FHE.sub_int_eint"(%i, %a) : (i3, !FHE.eint<2>) -> !FHE.esint<2>

Traits: AlwaysSpeculatableImplTrait

Interfaces: AdditiveNoise, Binary, BinaryIntEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| :-----: | ----------- |
| `a` | integer |
| `b` | |

Results:

Result
Description

«unnamed»

FHE.to_bool (::mlir::concretelang::FHE::ToBoolOp)

Cast an unsigned integer to a boolean

The input must be of width one or two. A width of two is the current representation of an encrypted boolean, leaving one bit for the carry.

Examples:

// ok
"FHE.to_bool"(%x) : (!FHE.eint<1>) -> !FHE.ebool
"FHE.to_bool"(%x) : (!FHE.eint<2>) -> !FHE.ebool

// error
"FHE.to_bool"(%x) : (!FHE.eint<3>) -> !FHE.ebool

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

An encrypted unsigned integer

Results:

Result
Description

«unnamed»

An encrypted boolean

FHE.to_signed (::mlir::concretelang::FHE::ToSignedOp)

Cast an unsigned integer to a signed one

The result must have the same width as the input.

The behavior is undefined on overflow/underflow.

Examples:

// ok
"FHE.to_signed"(%x) : (!FHE.eint<2>) -> !FHE.esint<2>

// error
"FHE.to_signed"(%x) : (!FHE.eint<2>) -> !FHE.esint<3>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

An encrypted unsigned integer

Results:

Result
Description

«unnamed»

An encrypted signed integer

FHE.to_unsigned (::mlir::concretelang::FHE::ToUnsignedOp)

Cast a signed integer to an unsigned one

The result must have the same width as the input.

The behavior is undefined on overflow/underflow.

Examples:

// ok
"FHE.to_unsigned"(%x) : (!FHE.esint<2>) -> !FHE.eint<2>

// error
"FHE.to_unsigned"(%x) : (!FHE.esint<2>) -> !FHE.eint<3>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

input

An encrypted signed integer

Results:

Result
Description

«unnamed»

An encrypted unsigned integer

FHE.zero (::mlir::concretelang::FHE::ZeroEintOp)

Returns a trivial encrypted integer of 0

Example:

"FHE.zero"() : () -> !FHE.eint<2>
"FHE.zero"() : () -> !FHE.esint<2>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), ZeroNoise

Effects: MemoryEffects::Effect{}

Results:

Result
Description

out

FHE.zero_tensor (::mlir::concretelang::FHE::ZeroTensorOp)

Creates a new tensor with all elements initialized to an encrypted zero.

Creates a new tensor with the shape specified in the result type and initializes its elements with an encrypted zero.

Example:

%tensor = "FHE.zero_tensor"() : () -> tensor<5x!FHE.eint<4>>
%tensor = "FHE.zero_tensor"() : () -> tensor<5x!FHE.esint<4>>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), ZeroNoise

Effects: MemoryEffects::Effect{}

Results:

Result
Description

tensor

Attribute definition

PartitionAttr

An attribute representing a partition.

Syntax:

#FHE.partition<
  StringAttr,   # name
  uint64_t,   # lweDim
  uint64_t,   # glweDim
  uint64_t,   # polySize
  uint64_t,   # pbsBaseLog
  uint64_t   # pbsLevel
>

Parameters:

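| Parameter | C++ type | Description |
| :-------: | :------: | ----------- |
| name | `StringAttr` | |
| lweDim | `uint64_t` | |
| glweDim | `uint64_t` | |
| polySize | `uint64_t` | |
| pbsBaseLog | `uint64_t` | |
| pbsLevel | `uint64_t` | |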

Type definition

EncryptedBooleanType

An encrypted boolean

Syntax: !FHE.ebool

An encrypted boolean.

EncryptedSignedIntegerType

An encrypted signed integer

An encrypted signed integer with width bits, used to perform FHE operations.

Examples:

!FHE.esint<7>
!FHE.esint<6>

Parameters:

Parameter
C++ type
Description

width

unsigned

EncryptedUnsignedIntegerType

An encrypted unsigned integer

An encrypted unsigned integer with width bits on which FHE operations are performed.

Examples:

!FHE.eint<7>
!FHE.eint<6>

Parameters:

Parameter
C++ type
Description

width

unsigned
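To make the width parameter concrete, here is a minimal sketch of the clear-value range that fits in each type. It assumes the usual two's-complement convention for the signed type; the helper name value_range is hypothetical and not part of Concrete.

```python
from typing import Tuple


def value_range(width: int, signed: bool) -> Tuple[int, int]:
    """Return the (min, max) clear values representable on `width` bits.

    Assumption: signed values follow the two's-complement convention.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if signed:
        return -(1 << (width - 1)), (1 << (width - 1)) - 1
    return 0, (1 << width) - 1


print(value_range(7, signed=False))  # range of !FHE.eint<7>  -> (0, 127)
print(value_range(7, signed=True))   # range of !FHE.esint<7> -> (-64, 63)
```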

Reusing arguments

This document explains how to reuse encrypted arguments in applications where the same arguments are used repeatedly.

Encrypting data can be resource-intensive, especially when the same argument or set of arguments is used multiple times. In such cases, it’s inefficient to encrypt and transfer the arguments repeatedly. Instead, you can encrypt the arguments separately and reuse them as needed. By encrypting the arguments once and reusing them, you can optimize performance by reducing encryption time, memory usage, and network bandwidth.

Here is an example:

from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def add(x, y):
    return x + y

inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]
circuit = add.compile(inputset)

sample_y = 4
_, encrypted_y = circuit.encrypt(None, sample_y)

for sample_x in range(3, 6):
    encrypted_x, _ = circuit.encrypt(sample_x, None)

    encrypted_result = circuit.run(encrypted_x, encrypted_y)
    result = circuit.decrypt(encrypted_result)

    assert result == sample_x + sample_y

Note the following when you use the encrypt method:

  • If you have multiple arguments, the encrypt method returns a tuple.

  • If you specify None as one of the arguments, None is placed at the same location in the resulting tuple.

    • For example, circuit.encrypt(a, None, b, c, None) returns (encrypted_a, None, encrypted_b, encrypted_c, None).

  • Each value returned by encrypt can be stored and reused anytime.

The order of arguments must be consistent when encrypting and using them. Encrypting an x and using it as a y could result in undefined behavior.
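The positional convention described in the notes above can be sketched with a toy stand-in. The function encrypt_stub below is hypothetical: it is not the Concrete API and performs no encryption. It only mimics how None placeholders keep their position in the returned tuple, and it assumes that a single argument yields a single value rather than a tuple.

```python
def encrypt_stub(*args):
    """Toy stand-in for an encrypt call (hypothetical, no real encryption).

    Each non-None argument is wrapped in a tagged tuple that remembers its
    position; None stays None at the same position.
    """
    wrapped = tuple(
        None if value is None else ("enc", position, value)
        for position, value in enumerate(args)
    )
    # Assumption: with a single argument, a single value is returned.
    return wrapped[0] if len(wrapped) == 1 else wrapped


# Only positions 0, 2 and 3 are "encrypted"; positions 1 and 4 stay None.
print(encrypt_stub(10, None, 20, 30, None))
```

Tagging each value with its position is one way to catch the mistake mentioned above, where a value encrypted as x is later passed as y.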

Concrete dialect

Low Level Fully Homomorphic Encryption dialect

A dialect for the representation of low-level operations on fully homomorphic ciphertexts.

Operation definition

Concrete.add_lwe_buffer (::mlir::concretelang::Concrete::AddLweBufferOp)

Returns the sum of 2 lwe ciphertexts

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

lhs

1D memref of 64-bit signless integer values

rhs

1D memref of 64-bit signless integer values

Concrete.add_lwe_tensor (::mlir::concretelang::Concrete::AddLweTensorOp)

Returns the sum of 2 lwe ciphertexts

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

1D tensor of 64-bit signless integer values

rhs

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of 64-bit signless integer values
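As a rough illustration of why adding two LWE ciphertexts is an element-wise sum of 64-bit words, here is a toy sketch. The parameters (dimension 16, scaling factor 2^56, tiny noise) and all helper names are assumptions chosen for readability; they are insecure and do not reflect how Concrete encodes or encrypts data.

```python
import random

Q = 1 << 64      # ciphertext elements are 64-bit words
DELTA = 1 << 56  # toy scaling factor: the message sits in the top 8 bits


def encrypt(message, key, rng):
    """Toy LWE encryption: returns the mask followed by the body."""
    mask = [rng.randrange(Q) for _ in key]
    noise = rng.randrange(-8, 9)
    body = (sum(a * s for a, s in zip(mask, key)) + message * DELTA + noise) % Q
    return mask + [body]


def decrypt(ciphertext, key):
    """Remove the mask contribution, then round away the noise."""
    *mask, body = ciphertext
    phase = (body - sum(a * s for a, s in zip(mask, key))) % Q
    return ((phase + DELTA // 2) // DELTA) % (Q // DELTA)


def add_lwe(lhs, rhs):
    """Element-wise sum modulo 2**64 of two ciphertexts of the same size."""
    return [(l + r) % Q for l, r in zip(lhs, rhs)]


rng = random.Random(0)
key = [rng.randrange(2) for _ in range(16)]  # toy binary secret key
ct_sum = add_lwe(encrypt(3, key, rng), encrypt(4, key, rng))
print(decrypt(ct_sum, key))  # 7
```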

Concrete.add_plaintext_lwe_buffer (::mlir::concretelang::Concrete::AddPlaintextLweBufferOp)

Returns the sum of a clear integer and an lwe ciphertext

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

lhs

1D memref of 64-bit signless integer values

rhs

64-bit signless integer

Concrete.add_plaintext_lwe_tensor (::mlir::concretelang::Concrete::AddPlaintextLweTensorOp)

Returns the sum of a clear integer and an lwe ciphertext

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

1D tensor of 64-bit signless integer values

rhs

64-bit signless integer

Results:

Result
Description

result

1D tensor of 64-bit signless integer values
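The following toy sketch shows one way a clear integer can be added to an LWE ciphertext. It assumes that rhs is an already-encoded 64-bit plaintext word and that only the body (the last element) of the ciphertext changes; both points are assumptions for illustration, as are the insecure toy parameters and the helper names.

```python
import random

Q = 1 << 64      # ciphertext elements are 64-bit words
DELTA = 1 << 56  # toy scaling factor used to encode messages


def encrypt(message, key, rng):
    """Toy LWE encryption: returns the mask followed by the body."""
    mask = [rng.randrange(Q) for _ in key]
    noise = rng.randrange(-8, 9)
    body = (sum(a * s for a, s in zip(mask, key)) + message * DELTA + noise) % Q
    return mask + [body]


def decrypt(ciphertext, key):
    """Remove the mask contribution, then round away the noise."""
    *mask, body = ciphertext
    phase = (body - sum(a * s for a, s in zip(mask, key))) % Q
    return ((phase + DELTA // 2) // DELTA) % (Q // DELTA)


def add_plaintext_lwe(lhs, rhs):
    """Add an encoded plaintext word to the body; the mask is left untouched."""
    return lhs[:-1] + [(lhs[-1] + rhs) % Q]


rng = random.Random(1)
key = [rng.randrange(2) for _ in range(16)]  # toy binary secret key
original = encrypt(5, key, rng)
shifted = add_plaintext_lwe(original, 2 * DELTA)  # plaintext 2, encoded
print(decrypt(shifted, key))  # 7
```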

Concrete.batched_add_lwe_buffer (::mlir::concretelang::Concrete::BatchedAddLweBufferOp)

Batched version of AddLweBufferOp, which performs the same operation on multiple elements

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

lhs

2D memref of 64-bit signless integer values

rhs

2D memref of 64-bit signless integer values

Concrete.batched_add_lwe_tensor (::mlir::concretelang::Concrete::BatchedAddLweTensorOp)

Batched version of AddLweTensorOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

2D tensor of 64-bit signless integer values

rhs

2D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.batched_add_plaintext_cst_lwe_buffer (::mlir::concretelang::Concrete::BatchedAddPlaintextCstLweBufferOp)

Batched version of AddPlaintextLweBufferOp, which performs the same operation on multiple elements

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

lhs

2D memref of 64-bit signless integer values

rhs

64-bit signless integer

Concrete.batched_add_plaintext_cst_lwe_tensor (::mlir::concretelang::Concrete::BatchedAddPlaintextCstLweTensorOp)

Batched version of AddPlaintextLweTensorOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

2D tensor of 64-bit signless integer values

rhs

64-bit signless integer

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.batched_add_plaintext_lwe_buffer (::mlir::concretelang::Concrete::BatchedAddPlaintextLweBufferOp)

Batched version of AddPlaintextLweBufferOp, which performs the same operation on multiple elements

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

lhs

2D memref of 64-bit signless integer values

rhs

1D memref of 64-bit signless integer values

Concrete.batched_add_plaintext_lwe_tensor (::mlir::concretelang::Concrete::BatchedAddPlaintextLweTensorOp)

Batched version of AddPlaintextLweTensorOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

2D tensor of 64-bit signless integer values

rhs

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values
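The difference between the two batched plaintext additions is visible in the operand shapes: the cst variants take a single 64-bit integer as rhs, while the other variants take a 1D rhs. A minimal sketch of that shape contract, assuming one ciphertext per row and that the plaintext is added to the last element of each row (both are assumptions, and the function names are hypothetical):

```python
Q = 1 << 64  # ciphertext elements are 64-bit words


def add_plaintext_row(ciphertext, plaintext):
    """Add a plaintext word to the last element of one ciphertext row."""
    return ciphertext[:-1] + [(ciphertext[-1] + plaintext) % Q]


def batched_add_plaintext_cst(lhs, rhs):
    """2D lhs, scalar rhs: the same plaintext is added to every row."""
    return [add_plaintext_row(row, rhs) for row in lhs]


def batched_add_plaintext(lhs, rhs):
    """2D lhs, 1D rhs: row i receives plaintext rhs[i]."""
    if len(lhs) != len(rhs):
        raise ValueError("rhs must hold one plaintext per ciphertext row")
    return [add_plaintext_row(row, pt) for row, pt in zip(lhs, rhs)]


batch = [[1, 2, 10], [3, 4, 20]]  # two toy "ciphertexts" of three words each
print(batched_add_plaintext_cst(batch, 5))   # [[1, 2, 15], [3, 4, 25]]
print(batched_add_plaintext(batch, [5, 7]))  # [[1, 2, 15], [3, 4, 27]]
```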

Concrete.batched_bootstrap_lwe_buffer (::mlir::concretelang::Concrete::BatchedBootstrapLweBufferOp)

Batched version of BootstrapLweOp, which performs the same operation on multiple elements

Attributes:

Attribute
MLIR Type
Description

inputLweDim

::mlir::IntegerAttr

32-bit signless integer attribute

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

glweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

input_ciphertext

2D memref of 64-bit signless integer values

lookup_table

1D memref of 64-bit signless integer values

Concrete.batched_bootstrap_lwe_tensor (::mlir::concretelang::Concrete::BatchedBootstrapLweTensorOp)

Batched version of BootstrapLweOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

inputLweDim

::mlir::IntegerAttr

32-bit signless integer attribute

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

glweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

input_ciphertext

2D tensor of 64-bit signless integer values

lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.batched_keyswitch_lwe_buffer (::mlir::concretelang::Concrete::BatchedKeySwitchLweBufferOp)

Batched version of KeySwitchLweOp, which performs the same operation on multiple elements

Attributes:

Attribute
MLIR Type
Description

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_in

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_out

::mlir::IntegerAttr

32-bit signless integer attribute

kskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

ciphertext

2D memref of 64-bit signless integer values

Concrete.batched_keyswitch_lwe_tensor (::mlir::concretelang::Concrete::BatchedKeySwitchLweTensorOp)

Batched version of KeySwitchLweOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_in

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_out

::mlir::IntegerAttr

32-bit signless integer attribute

kskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

ciphertext

2D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.batched_mapped_bootstrap_lwe_buffer (::mlir::concretelang::Concrete::BatchedMappedBootstrapLweBufferOp)

Batched, mapped version of BootstrapLweOp, which performs the same operation on multiple elements

Attributes:

Attribute
MLIR Type
Description

inputLweDim

::mlir::IntegerAttr

32-bit signless integer attribute

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

glweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

input_ciphertext

2D memref of 64-bit signless integer values

lookup_table_vector

2D memref of 64-bit signless integer values

Concrete.batched_mapped_bootstrap_lwe_tensor (::mlir::concretelang::Concrete::BatchedMappedBootstrapLweTensorOp)

Batched, mapped version of BootstrapLweOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

inputLweDim

::mlir::IntegerAttr

32-bit signless integer attribute

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

glweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

input_ciphertext

2D tensor of 64-bit signless integer values

lookup_table_vector

2D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values
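On clear values, the batched and the batched, mapped bootstraps can be pictured as table lookups that differ only in how tables are paired with inputs. The sketch below works on plain integers and is only a model of the intended semantics: it assumes that row i of lookup_table_vector applies to element i of the batch, and the function names are hypothetical.

```python
def batched_lookup(values, lookup_table):
    """One shared 1D table applied to every element of the batch."""
    return [lookup_table[v] for v in values]


def batched_mapped_lookup(values, lookup_table_vector):
    """One table per element: row i of the 2D table applies to values[i]."""
    if len(values) != len(lookup_table_vector):
        raise ValueError("expected one lookup table per batch element")
    return [table[v] for v, table in zip(values, lookup_table_vector)]


square = [v * v for v in range(4)]  # [0, 1, 4, 9]
double = [2 * v for v in range(4)]  # [0, 2, 4, 6]

print(batched_lookup([1, 2, 3], square))                           # [1, 4, 9]
print(batched_mapped_lookup([1, 2, 3], [square, double, double]))  # [1, 4, 6]
```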

Concrete.batched_mul_cleartext_cst_lwe_buffer (::mlir::concretelang::Concrete::BatchedMulCleartextCstLweBufferOp)

Batched version of MulCleartextLweBufferOp, which performs the same operation on multiple elements

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

lhs

2D memref of 64-bit signless integer values

rhs

64-bit signless integer

Concrete.batched_mul_cleartext_cst_lwe_tensor (::mlir::concretelang::Concrete::BatchedMulCleartextCstLweTensorOp)

Batched version of MulCleartextLweTensorOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

2D tensor of 64-bit signless integer values

rhs

64-bit signless integer

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.batched_mul_cleartext_lwe_buffer (::mlir::concretelang::Concrete::BatchedMulCleartextLweBufferOp)

Batched version of MulCleartextLweBufferOp, which performs the same operation on multiple elements

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

lhs

2D memref of 64-bit signless integer values

rhs

1D memref of 64-bit signless integer values

Concrete.batched_mul_cleartext_lwe_tensor (::mlir::concretelang::Concrete::BatchedMulCleartextLweTensorOp)

Batched version of MulCleartextLweTensorOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

2D tensor of 64-bit signless integer values

rhs

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.batched_negate_lwe_buffer (::mlir::concretelang::Concrete::BatchedNegateLweBufferOp)

Batched version of NegateLweBufferOp, which performs the same operation on multiple elements

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

ciphertext

2D memref of 64-bit signless integer values

Concrete.batched_negate_lwe_tensor (::mlir::concretelang::Concrete::BatchedNegateLweTensorOp)

Batched version of NegateLweTensorOp, which performs the same operation on multiple elements

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertext

2D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.bootstrap_lwe_buffer (::mlir::concretelang::Concrete::BootstrapLweBufferOp)

Bootstraps an LWE ciphertext with a GLWE trivial encryption of the lookup table

Attributes:

Attribute
MLIR Type
Description

inputLweDim

::mlir::IntegerAttr

32-bit signless integer attribute

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

glweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

input_ciphertext

1D memref of 64-bit signless integer values

lookup_table

1D memref of 64-bit signless integer values

Concrete.bootstrap_lwe_tensor (::mlir::concretelang::Concrete::BootstrapLweTensorOp)

Bootstraps an LWE ciphertext with a GLWE trivial encryption of the lookup table

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

inputLweDim

::mlir::IntegerAttr

32-bit signless integer attribute

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

glweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

input_ciphertext

1D tensor of 64-bit signless integer values

lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of 64-bit signless integer values

Concrete.encode_expand_lut_for_bootstrap_buffer (::mlir::concretelang::Concrete::EncodeExpandLutForBootstrapBufferOp)

Encode and expand a lookup table so that it can be used for a bootstrap

Attributes:

Attribute
MLIR Type
Description

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

outputBits

::mlir::IntegerAttr

32-bit signless integer attribute

isSigned

::mlir::BoolAttr

bool attribute

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

input_lookup_table

1D memref of 64-bit signless integer values

Concrete.encode_expand_lut_for_bootstrap_tensor (::mlir::concretelang::Concrete::EncodeExpandLutForBootstrapTensorOp)

Encode and expand a lookup table so that it can be used for a bootstrap

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

polySize

::mlir::IntegerAttr

32-bit signless integer attribute

outputBits

::mlir::IntegerAttr

32-bit signless integer attribute

isSigned

::mlir::BoolAttr

bool attribute

Operands:

Operand
Description

input_lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of 64-bit signless integer values

Concrete.encode_lut_for_crt_woppbs_buffer (::mlir::concretelang::Concrete::EncodeLutForCrtWopPBSBufferOp)

Encode and expand a lookup table so that it can be used for a crt wop pbs

Attributes:

Attribute
MLIR Type
Description

crtDecomposition

::mlir::ArrayAttr

64-bit integer array attribute

crtBits

::mlir::ArrayAttr

64-bit integer array attribute

modulusProduct

::mlir::IntegerAttr

32-bit signless integer attribute

isSigned

::mlir::BoolAttr

bool attribute

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

input_lookup_table

1D memref of 64-bit signless integer values

Concrete.encode_lut_for_crt_woppbs_tensor (::mlir::concretelang::Concrete::EncodeLutForCrtWopPBSTensorOp)

Encode and expand a lookup table so that it can be used for a wop pbs

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

crtDecomposition

::mlir::ArrayAttr

64-bit integer array attribute

crtBits

::mlir::ArrayAttr

64-bit integer array attribute

modulusProduct

::mlir::IntegerAttr

32-bit signless integer attribute

isSigned

::mlir::BoolAttr

bool attribute

Operands:

Operand
Description

input_lookup_table

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Concrete.encode_plaintext_with_crt_buffer (::mlir::concretelang::Concrete::EncodePlaintextWithCrtBufferOp)

Encodes a plaintext by decomposing it on a crt basis

Attributes:

Attribute
MLIR Type
Description

mods

::mlir::ArrayAttr

64-bit integer array attribute

modsProd

::mlir::IntegerAttr

64-bit signless integer attribute

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

input

64-bit signless integer

Concrete.encode_plaintext_with_crt_tensor (::mlir::concretelang::Concrete::EncodePlaintextWithCrtTensorOp)

Encodes a plaintext by decomposing it on a crt basis

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

mods

::mlir::ArrayAttr

64-bit integer array attribute

modsProd

::mlir::IntegerAttr

64-bit signless integer attribute

Operands:

Operand
Description

input

64-bit signless integer

Results:

Result
Description

result

1D tensor of 64-bit signless integer values
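Decomposing a value on a CRT basis means keeping its residue modulo each entry of mods. The sketch below shows only that integer-level step and its inverse, assuming pairwise coprime moduli. The scaling of each residue into a 64-bit word, which the op presumably also performs, is not modelled, and the function names are hypothetical.

```python
from math import prod


def crt_decompose(value, mods):
    """Residues of `value` modulo each CRT modulus."""
    return [value % m for m in mods]


def crt_recompose(residues, mods):
    """Inverse of crt_decompose for pairwise coprime moduli."""
    mods_prod = prod(mods)
    total = 0
    for residue, m in zip(residues, mods):
        partial = mods_prod // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m.
        total += residue * partial * pow(partial, -1, m)
    return total % mods_prod


mods = [3, 5, 7]  # product is 105, so values 0..104 are representable
residues = crt_decompose(52, mods)
print(residues)                       # [1, 2, 3]
print(crt_recompose(residues, mods))  # 52
```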

Concrete.keyswitch_lwe_buffer (::mlir::concretelang::Concrete::KeySwitchLweBufferOp)

Performs a keyswitching operation on an LWE ciphertext

Attributes:

Attribute
MLIR Type
Description

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_in

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_out

::mlir::IntegerAttr

32-bit signless integer attribute

kskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

ciphertext

1D memref of 64-bit signless integer values

Concrete.keyswitch_lwe_tensor (::mlir::concretelang::Concrete::KeySwitchLweTensorOp)

Performs a keyswitching operation on an LWE ciphertext

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

level

::mlir::IntegerAttr

32-bit signless integer attribute

baseLog

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_in

::mlir::IntegerAttr

32-bit signless integer attribute

lwe_dim_out

::mlir::IntegerAttr

32-bit signless integer attribute

kskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

ciphertext

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of 64-bit signless integer values

Concrete.mul_cleartext_lwe_buffer (::mlir::concretelang::Concrete::MulCleartextLweBufferOp)

Returns the product of a clear integer and an lwe ciphertext

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

lhs

1D memref of 64-bit signless integer values

rhs

64-bit signless integer

Concrete.mul_cleartext_lwe_tensor (::mlir::concretelang::Concrete::MulCleartextLweTensorOp)

Returns the product of a clear integer and an lwe ciphertext

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

lhs

1D tensor of 64-bit signless integer values

rhs

64-bit signless integer

Results:

Result
Description

result

1D tensor of 64-bit signless integer values

Concrete.negate_lwe_buffer (::mlir::concretelang::Concrete::NegateLweBufferOp)

Negates an lwe ciphertext

Operands:

Operand
Description

result

1D memref of 64-bit signless integer values

ciphertext

1D memref of 64-bit signless integer values

Concrete.negate_lwe_tensor (::mlir::concretelang::Concrete::NegateLweTensorOp)

Negates an lwe ciphertext

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

Operand
Description

ciphertext

1D tensor of 64-bit signless integer values

Results:

Result
Description

result

1D tensor of 64-bit signless integer values
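The cleartext multiplication and negation ops above can be pictured as element-wise operations on the 64-bit words of a ciphertext. The following toy sketch uses insecure, illustrative parameters and hypothetical helper names; it is not Concrete's implementation. Negation is taken modulo the toy message space of 256 values.

```python
import random

Q = 1 << 64      # ciphertext elements are 64-bit words
DELTA = 1 << 56  # toy scaling factor: messages live modulo 256


def encrypt(message, key, rng):
    """Toy LWE encryption: returns the mask followed by the body."""
    mask = [rng.randrange(Q) for _ in key]
    noise = rng.randrange(-8, 9)
    body = (sum(a * s for a, s in zip(mask, key)) + message * DELTA + noise) % Q
    return mask + [body]


def decrypt(ciphertext, key):
    """Remove the mask contribution, then round away the noise."""
    *mask, body = ciphertext
    phase = (body - sum(a * s for a, s in zip(mask, key))) % Q
    return ((phase + DELTA // 2) // DELTA) % (Q // DELTA)


def mul_cleartext_lwe(lhs, rhs):
    """Multiply every word by the clear integer (this also scales the noise)."""
    return [(word * rhs) % Q for word in lhs]


def negate_lwe(ciphertext):
    """Negate every word modulo 2**64."""
    return [(-word) % Q for word in ciphertext]


rng = random.Random(2)
key = [rng.randrange(2) for _ in range(16)]  # toy binary secret key
ct = encrypt(3, key, rng)
print(decrypt(mul_cleartext_lwe(ct, 5), key))  # 15
print(decrypt(negate_lwe(ct), key))            # 253, i.e. -3 modulo 256
```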

Concrete.wop_pbs_crt_lwe_buffer (::mlir::concretelang::Concrete::WopPBSCRTLweBufferOp)

Attributes:

Attribute
MLIR Type
Description

bootstrapLevel

::mlir::IntegerAttr

32-bit signless integer attribute

bootstrapBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

keyswitchLevel

::mlir::IntegerAttr

32-bit signless integer attribute

keyswitchBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchInputLweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchoutputPolynomialSize

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchLevel

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

circuitBootstrapLevel

::mlir::IntegerAttr

32-bit signless integer attribute

circuitBootstrapBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

crtDecomposition

::mlir::ArrayAttr

64-bit integer array attribute

kskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

pkskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

result

2D memref of 64-bit signless integer values

ciphertext

2D memref of 64-bit signless integer values

lookup_table

2D memref of 64-bit signless integer values

Concrete.wop_pbs_crt_lwe_tensor (::mlir::concretelang::Concrete::WopPBSCRTLweTensorOp)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

Attribute
MLIR Type
Description

bootstrapLevel

::mlir::IntegerAttr

32-bit signless integer attribute

bootstrapBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

keyswitchLevel

::mlir::IntegerAttr

32-bit signless integer attribute

keyswitchBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchInputLweDimension

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchoutputPolynomialSize

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchLevel

::mlir::IntegerAttr

32-bit signless integer attribute

packingKeySwitchBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

circuitBootstrapLevel

::mlir::IntegerAttr

32-bit signless integer attribute

circuitBootstrapBaseLog

::mlir::IntegerAttr

32-bit signless integer attribute

crtDecomposition

::mlir::ArrayAttr

64-bit integer array attribute

kskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

bskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

pkskIndex

::mlir::IntegerAttr

32-bit signless integer attribute

Operands:

Operand
Description

ciphertext

2D tensor of 64-bit signless integer values

lookupTable

2D tensor of 64-bit signless integer values

Results:

Result
Description

result

2D tensor of 64-bit signless integer values

Type definition

ContextType

A runtime context

Syntax: !Concrete.context

An abstract runtime context used to pass contextual values, such as public keys, ...

Making a release

This document explains how Zama people can release a new version of Concrete.

The process

Create the release branch if needed

All releases should be done on a release branch. Release branches are named release/MAJOR.MINOR.x (e.g., release/2.7.x):

  • either you create a new version, in which case you need to create the new release branch (e.g., the previous release was 2.6.x and you are now releasing 2.7.0):

git branch release/MAJOR.MINOR.x

  • or you create a dot release, in which case you should cherry-pick commits onto the branch of the release you want to fix (e.g., the previous release was 2.7.0 and you are now releasing 2.7.1).

All vMAJOR.MINOR.* releases are made from the release/MAJOR.MINOR.x branch, which is also the branch from which the GitBook documentation at https://docs.zama.ai/concrete/v/MAJOR.MINOR is built.
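The naming relationship between a release tag and its branch can be sketched with a small helper. The function release_branch is hypothetical and not part of the Concrete tooling; it only encodes the convention described here, assuming tags of the form vMAJOR.MINOR.REVISION.

```python
import re


def release_branch(tag: str) -> str:
    """Map a tag such as 'v2.7.1' to its release branch 'release/2.7.x'."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    major, minor, _revision = match.groups()
    return f"release/{major}.{minor}.x"


print(release_branch("v2.7.0"))  # release/2.7.x
print(release_branch("v2.7.1"))  # release/2.7.x (dot release, same branch)
```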

Create a new draft release

Each push on the release branch will start all tests of Concrete. When you are happy with the state of the release branch, you need to update the API documentation:

./ci/scripts/make_apidocs.sh

If you skip this step, the release workflow will stop at the release-checks step of concrete_python_release.yml. Don't forget to push the updated API docs to the branch.

Then you just need to tag.

git tag vMAJOR.MINOR.REVISION
git push origin vMAJOR.MINOR.REVISION

Pushing this new tag starts the release workflow: the workflow builds all release artifacts, then creates a new draft release on GitHub, which you can find at https://github.com/zama-ai/concrete/releases/tag/vMAJOR.MINOR.REVISION.

You should edit the changelog and the release documentation, then have them reviewed by the product marketing team.

Create a new official release

When the new release documentation has been reviewed, you may save the release as a non-draft release, then publish the wheels on PyPI using the https://github.com/zama-ai/concrete/actions/workflows/push_wheels_to_public_pypi.yml workflow, setting the version number to MAJOR.MINOR.VERSION.

Artifacts to check

Follow the summary checklist:

At the end, check all the artifacts:

Tensorizing operations

This guide explains tensorization and how it can improve the execution time of Concrete circuits.

Tensors should be used instead of scalars when possible to maximize loop parallelism.

For example:

import time

import numpy as np
from concrete import fhe

inputset = fhe.inputset(fhe.uint6, fhe.uint6, fhe.uint6)
for tensorize in [False, True]:
    def f(x, y, z):
        return (
            np.sum(fhe.array([x, y, z]) ** 2)
            if tensorize
            else (x ** 2) + (y ** 2) + (z ** 2)
        )

    compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted", "z": "encrypted"})
    circuit = compiler.compile(inputset)

    circuit.keygen()
    for sample in inputset[:3]:  # warmup
        circuit.encrypt_run_decrypt(*sample)

    timings = []
    for sample in inputset[3:13]:
        start = time.time()
        result = circuit.encrypt_run_decrypt(*sample)
        end = time.time()

        assert np.array_equal(result, f(*sample))
        timings.append(end - start)

    if not tensorize:
        print(f"without tensorization -> {np.mean(timings):.03f}s")
    else:
        print(f"   with tensorization -> {np.mean(timings):.03f}s")

This prints:

without tensorization -> 0.214s
   with tensorization -> 0.118s

Enabling dataflow roughly amounts to letting the runtime do this for you. It would also help in this specific case.
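Before timing the two variants, it can help to confirm on clear values that they compute the same function. The sketch below uses np.array as a plain-NumPy stand-in for fhe.array (an assumption made so the snippet runs without Concrete) and says nothing about FHE execution time.

```python
import numpy as np


def scalar_form(x, y, z):
    """Three independent scalar squarings followed by two additions."""
    return (x ** 2) + (y ** 2) + (z ** 2)


def tensor_form(x, y, z):
    """One squaring applied to a 3-element tensor, then a single sum."""
    return np.sum(np.array([x, y, z]) ** 2)


print(scalar_form(3, 4, 5), tensor_form(3, 4, 5))  # 50 50
```

In the tensorized form, the squaring is a single operation over three elements, which is the kind of structure that can be run as a parallel loop.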

Optimize cryptographic parameters

This guide explains how to help Concrete Optimizer select more performant parameters and thereby improve the execution time of Concrete circuits.

The idea is to obtain better cryptographic parameters (especially for table lookups) without changing the operations within the circuit.

Call FHE circuits from other languages

After doing a compilation, we end up with a couple of artifacts, including crypto parameters and a binary file containing the executable circuit. In order to be able to encrypt and run the circuit properly, we need to know how to interpret these artifacts, and there are a couple of utility functions which can be used to load them. These utility functions can be accessed through a variety of languages, including Python and C++.

Demo

We will use a really simple example for a demo, but the same steps can be done for any other circuit. example.mlir will contain the MLIR below:

func.func @main(%arg0: tensor<4x4x!FHE.eint<6>>, %arg1: tensor<4x2xi7>) -> tensor<4x2x!FHE.eint<6>> {
   %0 = "FHELinalg.matmul_eint_int"(%arg0, %arg1): (tensor<4x4x!FHE.eint<6>>, tensor<4x2xi7>) -> (tensor<4x2x!FHE.eint<6>>)
   %tlu = arith.constant dense<[40, 13, 20, 62, 47, 41, 46, 30, 59, 58, 17, 4, 34, 44, 49, 5, 10, 63, 18, 21, 33, 45, 7, 14, 24, 53, 56, 3, 22, 29, 1, 39, 48, 32, 38, 28, 15, 12, 52, 35, 42, 11, 6, 43, 0, 16, 27, 9, 31, 51, 36, 37, 55, 57, 54, 2, 8, 25, 50, 23, 61, 60, 26, 19]> : tensor<64xi64>
   %result = "FHELinalg.apply_lookup_table"(%0, %tlu): (tensor<4x2x!FHE.eint<6>>, tensor<64xi64>) -> (tensor<4x2x!FHE.eint<6>>)
   return %result: tensor<4x2x!FHE.eint<6>>
}

You can use the concretecompiler binary to compile this MLIR program. The same can be done with concrete-python, as we only need the compilation artifacts at the end.

$ concretecompiler --action=compile -o python-demo example.mlir

You should be able to see the artifacts listed in the python-demo directory:

$ ls python-demo/
client_parameters.concrete.params.json  compilation_feedback.json  fhecircuit-client.h  sharedlib.so  staticlib.a

Now we want to use the Python bindings in order to call the compiled circuit.

from concrete.compiler import (ClientSupport, LambdaArgument, LibrarySupport)

The main struct to manage compilation artifacts is LibrarySupport. Create one with the path you used during compilation, then load the result of the compilation:

lib_support = LibrarySupport.new("/path/to/your/python-demo/")
compilation_result = lib_support.reload()

Using the compilation result, you can load the server lambda (the entrypoint to the executable compiled circuit) as well as the client parameters (containing the crypto parameters):

server_lambda = lib_support.load_server_lambda(compilation_result)
client_params = lib_support.load_client_parameters(compilation_result)

The client uses the client parameters to generate keys and encrypt arguments for the circuit:

client_support = ClientSupport.new()
key_set = client_support.key_set(client_params)
args = [
	LambdaArgument.from_tensor_u8([1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4], [4, 4]),
	LambdaArgument.from_tensor_u8([1, 2, 1, 2, 1, 2, 1, 2], [4, 2])
]
encrypted_args = client_support.encrypt_arguments(client_params, key_set, args)

Only evaluation keys are required for the execution of the circuit. You can execute the circuit on the encrypted arguments via server_call:

eval_keys = key_set.get_evaluation_keys()
encrypted_result = lib_support.server_call(server_lambda, encrypted_args, eval_keys)

At this point, you have the encrypted result and can decrypt it using the keyset, which holds the secret key:

result_arg = client_support.decrypt_result(client_params, key_set, encrypted_result)
print("result tensor dims: {}".format(result_arg.n_values()))
print("result tensor data: {}".format(result_arg.get_values()))

Formatting and drawing

This document explains how to format and draw a compiled circuit in Python.

Formatting

To convert your compiled circuit into its textual representation, use the str function:

str(circuit)

If you just want to see the output on your terminal, you can directly print it as well:

print(circuit)

Drawing

To draw your compiled circuit, use the draw method:

drawing = circuit.draw()

This method draws the circuit, saves it as a temporary PNG file and returns the file path.

You can display the drawing in a Jupyter notebook:

from PIL import Image
drawing = Image.open(circuit.draw())
drawing.show()
drawing.close()

Alternatively, you can use the show option of the draw method to display the drawing with matplotlib:

circuit.draw(show=True)

Using this option will clear any existing matplotlib plots.

Lastly, to save the drawing to a specific path, use the save_to option:

destination = "/tmp/path/of/your/choice.png"
drawing = circuit.draw(save_to=destination)
assert drawing == destination

Optimize table lookups

This guide explains how costly table lookups are and how to optimize them to improve the execution time of Concrete circuits.

The most costly operation in Concrete is the table lookup, so one of the primary goals of optimizing performance is to reduce the number of table lookups.

Furthermore, the bit width of the input of the table lookup plays a major role in performance.

The code above prints:

Approximate mode

This guide teaches how to improve the execution time of Concrete circuits by using approximate mode for rounding.

prints:

Composition

For example:

This prints:

It means that specifying composition reduced the complexity of computing cube(square(x)) by about 27% (from 185_863_835 to 135_871_612).

Bit extraction

This guide teaches how to improve the execution time of Concrete circuits by using bit extraction.

For example:

prints:

That's almost 8x improvement to circuit complexity!
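As a plain-Python illustration (no encryption involved) of why the modulo can be swapped for a bit extraction, the sketch below checks that both formulations of is_even agree on every 6-bit input. It assumes that `fhe.bits(x)[0]` denotes the least significant bit, as the is_even example in this guide implies; `lsb`, `is_even_with_modulo` and `is_even_with_bit` are hypothetical helper names.

```python
def lsb(x):
    """Plain-Python stand-in for fhe.bits(x)[0] (assumed to be the least significant bit)."""
    return x & 1


def is_even_with_modulo(x):
    # In FHE, this form needs a table lookup over the full input bit width.
    return int(x % 2 == 0)


def is_even_with_bit(x):
    # In FHE, this form only needs the extracted bit and a subtraction.
    return 1 - lsb(x)


# Both formulations agree on every uint6 value.
agreement = all(is_even_with_modulo(x) == is_even_with_bit(x) for x in range(2**6))
print(agreement)
```

The two functions are interchangeable in the clear, so the choice between them only affects cost once the computation is compiled to FHE.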

Implementation strategies

This guide teaches how to improve the execution time of Concrete circuits by using different conversion strategies for complex operations.

Concrete provides multiple implementation strategies for complex operations such as bitwise operations and comparisons.

The default strategy is the one that doesn't increase the input bit width, even if it's less optimal than the others. If you don't care about the input bit widths (e.g., if the inputs are only used in this operation), you should definitely change the default strategy.

Choosing the correct strategy can lead to big speedups. So if you are not sure which one to use, you can compile with different strategies and compare the complexity.

For example, the following code:

prints:

or:

prints:

As you can see, strategies can affect the performance a lot! So make sure to select the appropriate one for your use case if you want to optimize performance.

There are also a couple of tests in test_compilation.py that show how to compile and run a circuit between a client and a server using serialization.

Formatting is designed for debugging purposes only. It's not possible to create the circuit back from its textual representation. See How to Deploy if that's your goal.

Drawing functionality requires the installation of the package with the full feature set. See the Installation section for instructions.

And displays:

You can enable approximate rounding to gain even more performance when using rounding by sacrificing some more exactness:

This guide explains how to optimize cryptographic parameters by specifying composition when using modules.

When using modules, make sure to specify composition so that the compiler can select better parameters based on how the functions in the module would be used.

Bit extraction is a cheap way to extract certain bits of encrypted values. It can be very useful for improving the performance of circuits.

import time

import numpy as np
import matplotlib.pyplot as plt
from concrete import fhe

def f(x):
    return x // 2

bit_widths = list(range(2, 9))
complexities = []
timings = []

for bit_width in bit_widths:
    inputset = fhe.inputset(lambda _: np.random.randint(0, 2 ** bit_width))

    compiler = fhe.Compiler(f, {"x": "encrypted"})
    circuit = compiler.compile(inputset)

    circuit.keygen()
    for sample in inputset[:3]:  # warmup
        circuit.encrypt_run_decrypt(*sample)

    current_timings = []
    for sample in inputset[3:13]:
        start = time.time()
        result = circuit.encrypt_run_decrypt(*sample)
        end = time.time()

        assert np.array_equal(result, f(*sample))
        current_timings.append(end - start)

    complexities.append(int(circuit.complexity))
    timings.append(float(np.mean(current_timings)))

    print(f"{bit_width} bits -> {complexities[-1]:>13_} complexity -> {timings[-1]:.06f}s")

figure, complexity_axis = plt.subplots()

color = "tab:red"
complexity_axis.set_xlabel("bit width")
complexity_axis.set_ylabel("complexity", color=color)
complexity_axis.plot(bit_widths, complexities, color=color)
complexity_axis.tick_params(axis="y", labelcolor=color)

timing_axis = complexity_axis.twinx()

color = 'tab:blue'
timing_axis.set_ylabel('execution time', color=color)
timing_axis.plot(bit_widths, timings, color=color)
timing_axis.tick_params(axis='y', labelcolor=color)

figure.tight_layout()
plt.show()
2 bits ->    29_944_416 complexity -> 0.019826s
3 bits ->    42_154_798 complexity -> 0.020093s
4 bits ->    61_979_934 complexity -> 0.021961s
5 bits ->    99_198_195 complexity -> 0.029475s
6 bits ->   230_210_706 complexity -> 0.062841s
7 bits ->   535_706_740 complexity -> 0.139669s
8 bits -> 1_217_510_420 complexity -> 0.318838s
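
To quantify how strongly the input bit width affects cost, the measurements printed above can be turned into growth ratios. This is a small plain-Python sketch that reuses only the complexity numbers from that output; `measured` and `growth_ratios` are illustrative names, not part of Concrete.

```python
# Complexities copied from the output above (bit width -> complexity).
measured = {
    2: 29_944_416,
    3: 42_154_798,
    4: 61_979_934,
    5: 99_198_195,
    6: 230_210_706,
    7: 535_706_740,
    8: 1_217_510_420,
}


def growth_ratios(complexities):
    """Return, for each bit width, the complexity ratio relative to the previous width."""
    widths = sorted(complexities)
    return {
        current: complexities[current] / complexities[previous]
        for previous, current in zip(widths, widths[1:])
    }


ratios = growth_ratios(measured)
for width, ratio in ratios.items():
    print(f"{width - 1} -> {width} bits: x{ratio:.2f}")

overall = measured[8] / measured[2]
print(f"2 -> 8 bits: x{overall:.1f}")
```

In these measurements, each extra bit costs more than the one before it from 5 bits onward, which is why reducing the input bit width of a table lookup tends to pay off.
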
import numpy as np
from concrete import fhe

inputset = fhe.inputset(fhe.uint10)
for lsbs_to_remove in range(0, 10):
    def f(x):
        return fhe.round_bit_pattern(x, lsbs_to_remove, exactness=fhe.Exactness.APPROXIMATE) // 2

    compiler = fhe.Compiler(f, {"x": "encrypted"})
    circuit = compiler.compile(inputset)

    print(f"{lsbs_to_remove=} -> {int(circuit.complexity):>13_} complexity")
lsbs_to_remove=0 -> 9_134_406_574 complexity
lsbs_to_remove=1 -> 5_548_275_712 complexity
lsbs_to_remove=2 -> 2_430_793_927 complexity
lsbs_to_remove=3 -> 1_058_638_119 complexity
lsbs_to_remove=4 ->   409_952_712 complexity
lsbs_to_remove=5 ->   172_138_947 complexity
lsbs_to_remove=6 ->    99_198_195 complexity
lsbs_to_remove=7 ->    71_644_380 complexity
lsbs_to_remove=8 ->    55_860_516 complexity
lsbs_to_remove=9 ->    50_978_148 complexity
import numpy as np
from concrete import fhe


@fhe.module()
class PowerWithoutComposition:
    @fhe.function({"x": "encrypted"})
    def square(x):
        return x ** 2

    @fhe.function({"x": "encrypted"})
    def cube(x):
        return x ** 3

without_composition = PowerWithoutComposition.compile(
    {
        "square": fhe.inputset(fhe.uint2),
        "cube": fhe.inputset(fhe.uint4),
    }
)
print(f"without composition -> {int(without_composition.complexity):>10_} complexity")


@fhe.module()
class PowerWithComposition:
    @fhe.function({"x": "encrypted"})
    def square(x):
        return x ** 2

    @fhe.function({"x": "encrypted"})
    def cube(x):
        return x ** 3

    composition = fhe.Wired(
        [
            fhe.Wire(fhe.Output(square, 0), fhe.Input(cube, 0))
        ]
    )

with_composition = PowerWithComposition.compile(
    {
        "square": fhe.inputset(fhe.uint2),
        "cube": fhe.inputset(fhe.uint4),
    }
)
print(f"   with composition -> {int(with_composition.complexity):>10_} complexity")
without composition -> 185_863_835 complexity
   with composition -> 135_871_612 complexity
import numpy as np
from concrete import fhe

inputset = fhe.inputset(fhe.uint6)
for bit_extraction in [False, True]:
    def is_even(x):
        return (
            x % 2 == 0
            if not bit_extraction
            else 1 - fhe.bits(x)[0]
        )

    compiler = fhe.Compiler(is_even, {"x": "encrypted"})
    circuit = compiler.compile(inputset)

    if not bit_extraction:
        print(f"without bit extraction -> {int(circuit.complexity):>11_} complexity")
    else:
        print(f"   with bit extraction -> {int(circuit.complexity):>11_} complexity")
without bit extraction -> 230_210_706 complexity
   with bit extraction ->  29_506_014 complexity
import numpy as np
from concrete import fhe

def f(x, y):
    return x & y

inputset = fhe.inputset(fhe.uint3, fhe.uint4)
strategies = [
    fhe.BitwiseStrategy.ONE_TLU_PROMOTED,
    fhe.BitwiseStrategy.THREE_TLU_CASTED,
    fhe.BitwiseStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED,
    fhe.BitwiseStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED,
    fhe.BitwiseStrategy.CHUNKED,
]

for strategy in strategies:
    compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
    circuit = compiler.compile(inputset, bitwise_strategy_preference=strategy)
    print(
        f"{strategy:>55} "
        f"-> {circuit.programmable_bootstrap_count:>2} TLUs "
        f"-> {int(circuit.complexity):>12_} complexity"
    )
                       BitwiseStrategy.ONE_TLU_PROMOTED ->  1 TLUs ->  535_706_740 complexity
                       BitwiseStrategy.THREE_TLU_CASTED ->  3 TLUs ->  599_489_229 complexity
 BitwiseStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED ->  2 TLUs ->  522_239_955 complexity
 BitwiseStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED ->  2 TLUs ->  519_246_216 complexity
                                BitwiseStrategy.CHUNKED ->  6 TLUs ->  358_905_521 complexity
import numpy as np
from concrete import fhe

def f(x, y):
    return x == y

inputset = fhe.inputset(fhe.uint4, fhe.uint7)
strategies = [
    fhe.ComparisonStrategy.ONE_TLU_PROMOTED,
    fhe.ComparisonStrategy.THREE_TLU_CASTED,
    fhe.ComparisonStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED,
    fhe.ComparisonStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED,
    fhe.ComparisonStrategy.THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED,
    fhe.ComparisonStrategy.TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED,
    fhe.ComparisonStrategy.CHUNKED,
]

for strategy in strategies:
    compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
    circuit = compiler.compile(inputset, comparison_strategy_preference=strategy)
    print(
        f"{strategy:>58} "
        f"-> {circuit.programmable_bootstrap_count:>2} TLUs "
        f"-> {int(circuit.complexity):>13_} complexity"
    )
                       ComparisonStrategy.ONE_TLU_PROMOTED ->  1 TLUs -> 1_217_510_420 complexity
                       ComparisonStrategy.THREE_TLU_CASTED ->  3 TLUs ->   751_172_128 complexity
 ComparisonStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED ->  2 TLUs -> 1_043_702_103 complexity
 ComparisonStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED ->  2 TLUs -> 1_898_305_707 complexity
ComparisonStrategy.THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED ->  3 TLUs ->   751_172_128 complexity
ComparisonStrategy.TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED ->  2 TLUs ->   682_694_770 complexity
                                ComparisonStrategy.CHUNKED ->  3 TLUs ->   751_172_128 complexity

Bitwise operations

This document describes how bitwise operations (typically "AND", "OR", and so on) are managed in Concrete. It covers different strategies to make the FHE computations faster, depending on the context.

Bitwise operations are not native operations in Concrete, so they need to be implemented using existing native operations (i.e., additions, clear multiplications, negations, table lookups). Concrete offers two different implementations for performing bitwise operations.

Chunked

This is the most general implementation that can be used in any situation. The idea is:

# (example below is for bit-width of 8 and chunk size of 4)

# extract chunks of lhs using table lookups
lhs_chunks = [lhs.bits[0:4], lhs.bits[4:8]]

# extract chunks of rhs using table lookups
rhs_chunks = [rhs.bits[0:4], rhs.bits[4:8]]

# pack chunks of lhs and rhs using clear multiplications and additions 
packed_chunks = []
for lhs_chunk, rhs_chunk in zip(lhs_chunks, rhs_chunks):
    shifted_lhs_chunk = lhs_chunk * 2**4  # (i.e., lhs_chunk << 4)
    packed_chunks.append(shifted_lhs_chunk + rhs_chunk)

# apply bitwise table lookup to packed chunks
bitwise_table = fhe.LookupTable([...])
result_chunks = bitwise_table[packed_chunks]

# sum resulting chunks to obtain the result
result = np.sum(result_chunks)
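
The idea above can be simulated on plain integers to check that chunking, packing, table lookup and summation really reproduce the bitwise operation. This is a sketch under stated assumptions, not the compiler's implementation: it works on clear values, uses Python lists in place of `fhe.LookupTable`, and `build_and_table` and `chunked_and` are hypothetical names.

```python
def build_and_table(chunk_size, shift):
    """Table for AND on a packed pair of chunks, pre-shifted for the final sum."""
    mask = (1 << chunk_size) - 1
    # Index layout: (lhs_chunk << chunk_size) + rhs_chunk
    return [((i >> chunk_size) & (i & mask)) << shift for i in range(1 << (2 * chunk_size))]


def chunked_and(lhs, rhs, bit_width=8, chunk_size=4):
    mask = (1 << chunk_size) - 1
    result = 0
    for shift in range(0, bit_width, chunk_size):
        # Extract one chunk of each operand (table lookups in FHE).
        lhs_chunk = (lhs >> shift) & mask
        rhs_chunk = (rhs >> shift) & mask

        # Pack the chunks (clear multiplication and addition in FHE).
        packed = lhs_chunk * 2**chunk_size + rhs_chunk

        # Apply the bitwise table and accumulate the partial result.
        result += build_and_table(chunk_size, shift)[packed]
    return result


print(chunked_and(0b1100_1010, 0b1010_0110) == (0b1100_1010 & 0b1010_0110))
```

Each table already places its chunk at the right bit position, so a plain sum of the partial results is enough to assemble the output.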

Notes

  • Signed bitwise operations are not supported.

  • The optimal chunk size is selected automatically to reduce the number of table lookups.

  • Chunked bitwise operations result in at least 4 and at most 9 table lookups.

  • It is used if no other implementation can be used.

Pros

  • Can be used with any integers.

Cons

  • Very expensive.

Example

import numpy as np
from concrete import fhe

def f(x, y):
    return x & y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, show_mlir=True)

produces

module {
  
  // no promotions
  func.func @main(%arg0: !FHE.eint<4>, %arg1: !FHE.eint<4>) -> !FHE.eint<4> {

    // extracting the first chunk of x, adjusted for shifting
    %cst = arith.constant dense<[0, 0, 0, 0, 4, 4, 4, 4, 8, 8, 8, 8, 12, 12, 12, 12]> : tensor<16xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
        
    // extracting the first chunk of y
    %cst_0 = arith.constant dense<[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]> : tensor<16xi64>
    %1 = "FHE.apply_lookup_table"(%arg1, %cst_0) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
        
    // packing the first chunks
    %2 = "FHE.add_eint"(%0, %1) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
        
    // applying the bitwise operation to the first chunks, adjusted for addition in the end
    %cst_1 = arith.constant dense<[0, 0, 0, 0, 0, 4, 0, 4, 0, 0, 8, 8, 0, 4, 8, 12]> : tensor<16xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_1) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
        
    // extracting the second chunk of x, adjusted for shifting
    %cst_2 = arith.constant dense<[0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12]> : tensor<16xi64>
    %4 = "FHE.apply_lookup_table"(%arg0, %cst_2) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
        
    // extracting the second chunk of y
    %cst_3 = arith.constant dense<[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]> : tensor<16xi64>
    %5 = "FHE.apply_lookup_table"(%arg1, %cst_3) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
        
    // packing the second chunks
    %6 = "FHE.add_eint"(%4, %5) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
        
    // applying the bitwise operation to second chunks
    %cst_4 = arith.constant dense<[0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 2, 2, 0, 1, 2, 3]> : tensor<16xi64>
    %7 = "FHE.apply_lookup_table"(%6, %cst_4) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
        
    // adding resulting chunks to obtain the result
    %8 = "FHE.add_eint"(%7, %3) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
        
    return %8 : !FHE.eint<4>

  }
  
}

Packing Trick

This implementation uses the fact that we can combine two values into a single value and apply a single table lookup to this combined value!

There are two major problems with this implementation:

  1. packing requires the same bit-width across operands.

  2. packing requires a bit-width of at least x.bit_width + y.bit_width, and that bit-width cannot exceed the maximum TLU bit-width, which is 16 at the moment.

What this means is that if we are applying a bitwise operation to uint3 and uint6, we need to convert both of them to uint9 in some way to do the packing and proceed with the TLU in 9-bits. There are 4 ways to achieve this behavior.

Requirements

  • x.bit_width + y.bit_width <= MAXIMUM_TLU_BIT_WIDTH

1. fhe.BitwiseStrategy.ONE_TLU_PROMOTED

This strategy makes sure that during bit-width assignment, both operands are assigned the same bit-width, and that bit-width contains at least the amount of bits required to store pack(x, y). The idea is:

bitwise_lut = fhe.LookupTable([...])
result = bitwise_lut[pack(x_promoted_to_uint9, y_promoted_to_uint9)]
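
To make the packing step concrete, here is a plain-Python sketch for a uint3 and a uint6 operand. It runs on clear integers and uses a list in place of `fhe.LookupTable`; `pack`, `AND_TABLE` and `packed_and` are hypothetical names, and the layout (x in the high bits, y in the low bits) is an assumption chosen to match the multiply-then-add packing in the MLIR examples of this document.

```python
X_BITS, Y_BITS = 3, 6  # uint3 and uint6 operands, packed into 9 bits


def pack(x, y):
    """Combine two values into one: x goes to the high bits, y to the low bits."""
    return x * 2**Y_BITS + y


# One table with 2**(3 + 6) = 512 entries covers every (x, y) pair.
AND_TABLE = [
    (packed >> Y_BITS) & (packed & (2**Y_BITS - 1))
    for packed in range(2 ** (X_BITS + Y_BITS))
]


def packed_and(x, y):
    # A single lookup on the packed value yields x & y.
    return AND_TABLE[pack(x, y)]


print(len(AND_TABLE), packed_and(0b101, 0b110111))
```

The table size doubles with every extra bit in either operand, which is why the requirement on x.bit_width + y.bit_width matters.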

Pros

  • It will always result in a single table lookup.

Cons

  • It will significantly increase the bit-width of both operands and lock them to each other across the whole circuit, which can result in significant slowdowns if the operands are used in other costly operations.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    bitwise_strategy_preference=fhe.BitwiseStrategy.ONE_TLU_PROMOTED,
)

def f(x, y):
    return x & y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // promotions          ............         ............
  func.func @main(%arg0: !FHE.eint<8>, %arg1: !FHE.eint<8>) -> !FHE.eint<4> {
    
    // packing
    %c16_i9 = arith.constant 16 : i9
    %0 = "FHE.mul_eint_int"(%arg0, %c16_i9) : (!FHE.eint<8>, i9) -> !FHE.eint<8>
    %1 = "FHE.add_eint"(%0, %arg1) : (!FHE.eint<8>, !FHE.eint<8>) -> !FHE.eint<8>
        
    // computing the result
    %cst = arith.constant dense<"..."> : tensor<256xi64>
    %2 = "FHE.apply_lookup_table"(%1, %cst) : (!FHE.eint<8>, tensor<256xi64>) -> !FHE.eint<4>
        
    return %2 : !FHE.eint<4>
        
  }
  
}

2. fhe.BitwiseStrategy.THREE_TLU_CASTED

This strategy will not put any constraint on bit-widths during bit-width assignment, instead operands are cast to a bit-width that can store pack(x, y) during runtime using table lookups. The idea is:

uint3_to_uint9_lut = fhe.LookupTable([...])
x_cast_to_uint9 = uint3_to_uint9_lut[x]

uint6_to_uint9_lut = fhe.LookupTable([...])
y_cast_to_uint9 = uint6_to_uint9_lut[y]

bitwise_lut = fhe.LookupTable([...])
result = bitwise_lut[pack(x_cast_to_uint9, y_cast_to_uint9)]

Notes

  • It can result in a single table lookup as well, if x and y are assigned (because of other operations) the same bit-width, and that bit-width can store pack(x, y).

  • Or in two table lookups if only one of the operands is assigned a bit-width bigger than or equal to the bit width that can store pack(x, y).
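
The notes above can be restated as a small counting rule. The function below is only a paraphrase of those notes in plain Python, not compiler logic; `three_tlu_casted_lookups` is a hypothetical name, and the case where both operands are wide enough but have different bit-widths is not covered by the notes, so treating it as two lookups is an assumption.

```python
def three_tlu_casted_lookups(x_assigned, y_assigned, x_bits, y_bits):
    """Count table lookups for THREE_TLU_CASTED given the assigned bit-widths."""
    required = x_bits + y_bits  # bit-width able to store pack(x, y)

    if x_assigned == y_assigned and x_assigned >= required:
        return 1  # no cast needed, only the bitwise lookup
    if x_assigned >= required or y_assigned >= required:
        return 2  # one cast plus the bitwise lookup (assumption when both are wide)
    return 3  # two casts plus the bitwise lookup


# uint3 & uint6 under different bit-width assignments
print(three_tlu_casted_lookups(3, 6, 3, 6))
print(three_tlu_casted_lookups(9, 6, 3, 6))
print(three_tlu_casted_lookups(9, 9, 3, 6))
```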

Pros

  • It will not put any constraints on bit-widths of the operands, which is amazing if they are used in other costly operations.

  • It will result in at most 3 table lookups, which is still good.

Cons

  • If you are not doing anything else with the operands, or doing less costly operations compared to bitwise, it will introduce up to two unnecessary table lookups and slow down execution compared to fhe.BitwiseStrategy.ONE_TLU_PROMOTED.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    bitwise_strategy_preference=fhe.BitwiseStrategy.THREE_TLU_CASTED,
)

def f(x, y):
    return x & y

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // no promotions
  func.func @main(%arg0: !FHE.eint<4>, %arg1: !FHE.eint<4>) -> !FHE.eint<4> {
    
    // casting
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]> : tensor<16xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<8>
    %1 = "FHE.apply_lookup_table"(%arg1, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<8>

    // packing
    %c16_i9 = arith.constant 16 : i9
    %2 = "FHE.mul_eint_int"(%0, %c16_i9) : (!FHE.eint<8>, i9) -> !FHE.eint<8>
    %3 = "FHE.add_eint"(%2, %1) : (!FHE.eint<8>, !FHE.eint<8>) -> !FHE.eint<8>
        
    // computing the result
    %cst_0 = arith.constant dense<"..."> : tensor<256xi64>
    %4 = "FHE.apply_lookup_table"(%3, %cst_0) : (!FHE.eint<8>, tensor<256xi64>) -> !FHE.eint<4>
        
    return %4 : !FHE.eint<4>
        
  }
  
}

3. fhe.BitwiseStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED

This strategy can be viewed as a middle ground between the two strategies described above. With this strategy, only the bigger operand will be constrained to have at least the required bit-width to store pack(x, y), and the smaller operand will be cast to that bit-width during runtime. The idea is:

uint3_to_uint9_lut = fhe.LookupTable([...])
x_cast_to_uint9 = uint3_to_uint9_lut[x]

bitwise_lut = fhe.LookupTable([...])
result = bitwise_lut[pack(x_cast_to_uint9, y_promoted_to_uint9)]

Notes

  • It can result in a single table lookup as well, if the smaller operand is assigned (because of other operations) the same bit-width as the bigger operand.

Pros

  • It will only put a constraint on the bigger operand, which is great if the smaller operand is used in other costly operations.

  • It will result in at most 2 table lookups, which is great.

Cons

  • It will significantly increase the bit-width of the bigger operand which can result in significant slowdowns if the bigger operand is used in other costly operations.

  • If you are not doing anything else with the smaller operand, or doing less costly operations compared to bitwise, it could introduce an unnecessary table lookup and slow down execution compared to fhe.BitwiseStrategy.THREE_TLU_CASTED.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    bitwise_strategy_preference=fhe.BitwiseStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED,
)

def f(x, y):
    return x & y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**6))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // promotions                               ............
  func.func @main(%arg0: !FHE.eint<3>, %arg1: !FHE.eint<8>) -> !FHE.eint<3> {
    
    // casting smaller operand
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7]> : tensor<8xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<8>
        
    // packing
    %c32_i9 = arith.constant 32 : i9
    %1 = "FHE.mul_eint_int"(%0, %c32_i9) : (!FHE.eint<8>, i9) -> !FHE.eint<8>
    %2 = "FHE.add_eint"(%1, %arg1) : (!FHE.eint<8>, !FHE.eint<8>) -> !FHE.eint<8>
        
    // computing the result
    %cst_0 = arith.constant dense<"..."> : tensor<256xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_0) : (!FHE.eint<8>, tensor<256xi64>) -> !FHE.eint<3>
        
    return %3 : !FHE.eint<3>
        
  }
  
}

4. fhe.BitwiseStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED

This strategy is like the exact opposite of the strategy above. With this, only the smaller operand will be constrained to have at least the required bit-width, and the bigger operand will be cast during runtime. The idea is:

uint6_to_uint9_lut = fhe.LookupTable([...])
y_cast_to_uint9 = uint6_to_uint9_lut[y]

bitwise_lut = fhe.LookupTable([...])
result = bitwise_lut[pack(x_promoted_to_uint9, y_cast_to_uint9)]

Notes

  • It can result in a single table lookup as well, if the bigger operand is assigned (because of other operations) the same bit-width as the smaller operand.

Pros

  • It will only put a constraint on the smaller operand, which is great if the bigger operand is used in other costly operations.

  • It will result in at most 2 table lookups, which is great.

Cons

  • It will increase the bit-width of the smaller operand which can result in significant slowdowns if the smaller operand is used in other costly operations.

  • If you are not doing anything else with the bigger operand, or doing less costly operations compared to bitwise, it could introduce an unnecessary table lookup and slow down execution compared to fhe.BitwiseStrategy.THREE_TLU_CASTED.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    bitwise_strategy_preference=fhe.BitwiseStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED,
)

def f(x, y):
    return x | y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**6))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // promotions          ............
  func.func @main(%arg0: !FHE.eint<9>, %arg1: !FHE.eint<6>) -> !FHE.eint<6> {
    
    // casting bigger operand
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]> : tensor<64xi64>
    %0 = "FHE.apply_lookup_table"(%arg1, %cst) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.eint<9>
        
    // packing
    %c64_i10 = arith.constant 64 : i10
    %1 = "FHE.mul_eint_int"(%arg0, %c64_i10) : (!FHE.eint<9>, i10) -> !FHE.eint<9>
    %2 = "FHE.add_eint"(%1, %0) : (!FHE.eint<9>, !FHE.eint<9>) -> !FHE.eint<9>
        
    // computing the result
    %cst_0 = arith.constant dense<"..."> : tensor<512xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_0) : (!FHE.eint<9>, tensor<512xi64>) -> !FHE.eint<6>
        
    return %3 : !FHE.eint<6>

  }
  
}

Summary

| Strategy | Minimum # of TLUs | Maximum # of TLUs | Can increase the bit-width of the inputs |
| --- | --- | --- | --- |
| CHUNKED | 4 | 9 | |
| ONE_TLU_PROMOTED | 1 | 1 | ✓ |
| THREE_TLU_CASTED | 1 | 3 | |
| TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED | 1 | 2 | ✓ |
| TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED | 1 | 2 | ✓ |

Concrete will choose the best strategy available after bit-width assignment, regardless of the specified preference.
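
The bit-width constraints described for each strategy can be summarized in a few lines of plain Python. This is a sketch of the descriptions in this document rather than the compiler's bit-width assignment: `minimum_assigned_widths` is a hypothetical helper, strategies are passed as strings, and treating x as the bigger operand when both widths are equal is an assumption.

```python
def minimum_assigned_widths(strategy, x_bits, y_bits):
    """Return the minimum (x, y) bit-widths implied by a bitwise strategy preference."""
    required = x_bits + y_bits  # bit-width able to store pack(x, y)
    x_is_bigger = x_bits >= y_bits  # tie-breaking is an assumption

    if strategy == "ONE_TLU_PROMOTED":
        return required, required
    if strategy in ("THREE_TLU_CASTED", "CHUNKED"):
        return x_bits, y_bits  # no constraint on the inputs
    if strategy == "TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED":
        return (required, y_bits) if x_is_bigger else (x_bits, required)
    if strategy == "TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED":
        return (x_bits, required) if x_is_bigger else (required, y_bits)
    raise ValueError(f"unknown strategy: {strategy}")


for name in (
    "CHUNKED",
    "ONE_TLU_PROMOTED",
    "THREE_TLU_CASTED",
    "TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED",
    "TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED",
):
    print(f"{name:>40} -> {minimum_assigned_widths(name, 3, 6)}")
```

Other operations in the circuit may push the widths higher; the values returned here are only lower bounds.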

Different strategies are good for different circuits. If you want the best runtime for your use case, you can compile your circuit with all the different bitwise strategy preferences and pick the one with the lowest complexity.
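
Picking the preference with the lowest complexity can be automated once the complexities are collected. The sketch below is plain Python and does not call Concrete: the numbers are copied from the bitwise strategy measurements printed in the implementation strategies guide of this documentation (x & y with uint3 and uint4 inputs), and `pick_lowest_complexity` is a hypothetical helper. In practice, the dictionary would be filled from circuit.complexity after compiling with each preference.

```python
# Complexities reported for x & y with uint3 and uint4 inputs.
measured = {
    "ONE_TLU_PROMOTED": 535_706_740,
    "THREE_TLU_CASTED": 599_489_229,
    "TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED": 522_239_955,
    "TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED": 519_246_216,
    "CHUNKED": 358_905_521,
}


def pick_lowest_complexity(complexity_by_strategy):
    """Return the strategy name with the smallest reported complexity."""
    return min(complexity_by_strategy, key=complexity_by_strategy.get)


best = pick_lowest_complexity(measured)
print(best, f"{measured[best]:_}")
```

Note that in these measurements the winner is also the strategy with the most table lookups, so the TLU count alone is not a reliable proxy for complexity.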

Shifts

The same configuration option is used to modify the behavior of encrypted shift operations. Shifts are much more complex to implement, so we'll not go over the details. What is important is that, in the end, the result is computed using additions or subtractions on the original shifted operand. Since additions and subtractions require the same bit-width across operands, input and output bit-widths need to be synchronized at some point. There are two ways to do this:

With promotion

Here, the shifted operand and shift result are assigned the same bit-width during bit-width assignment, which avoids an additional TLU on the shifted operand. On the other hand, it might increase the bit-width of the result or the shifted operand, and if they're used in other costly operations, it could result in significant slowdowns. This is the default behavior.

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    shifts_with_promotion=True,
)

def f(x, y):
    return x << y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**2))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // promotions          ............
  func.func @main(%arg0: !FHE.eint<6>, %arg1: !FHE.eint<2>) -> !FHE.eint<6> {
    
    // shifting for the second bit of y
    %cst = arith.constant dense<[0, 0, 1, 1]> : tensor<4xi64>
    %0 = "FHE.apply_lookup_table"(%arg1, %cst) : (!FHE.eint<2>, tensor<4xi64>) -> !FHE.eint<4>
    %cst_0 = arith.constant dense<[0, 0, 0, 2, 2, 2, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]> : tensor<64xi64>
    %1 = "FHE.apply_lookup_table"(%arg0, %cst_0) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.eint<4>
    %2 = "FHE.add_eint"(%1, %0) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %cst_1 = arith.constant dense<[0, 0, 0, 8, 0, 16, 0, 24, 0, 32, 0, 40, 0, 48, 0, 56]> : tensor<16xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_1) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %cst_2 = arith.constant dense<[0, 6, 12, 2, 8, 14, 4, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]> : tensor<64xi64>
    %4 = "FHE.apply_lookup_table"(%arg0, %cst_2) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.eint<4>
    %5 = "FHE.add_eint"(%4, %0) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %cst_3 = arith.constant dense<[0, 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7]> : tensor<16xi64>
    %6 = "FHE.apply_lookup_table"(%5, %cst_3) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %7 = "FHE.add_eint"(%3, %6) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
    %8 = "FHE.add_eint"(%7, %arg0) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
    
    // shifting for the first bit of y
    %cst_4 = arith.constant dense<[0, 1, 0, 1]> : tensor<4xi64>
    %9 = "FHE.apply_lookup_table"(%arg1, %cst_4) : (!FHE.eint<2>, tensor<4xi64>) -> !FHE.eint<4>
    %cst_5 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8, 10, 10, 10, 10, 10, 10, 10, 10, 12, 12, 12, 12, 12, 12, 12, 12, 14, 14, 14, 14, 14, 14, 14, 14]> : tensor<64xi64>
    %10 = "FHE.apply_lookup_table"(%8, %cst_5) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.eint<4>
    %11 = "FHE.add_eint"(%10, %9) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %12 = "FHE.apply_lookup_table"(%11, %cst_1) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %cst_6 = arith.constant dense<[0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14]> : tensor<64xi64>
    %13 = "FHE.apply_lookup_table"(%8, %cst_6) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.eint<4>
    %14 = "FHE.add_eint"(%13, %9) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %15 = "FHE.apply_lookup_table"(%14, %cst_3) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %16 = "FHE.add_eint"(%12, %15) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
    %17 = "FHE.add_eint"(%16, %8) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
        
    return %17 : !FHE.eint<6>
        
  }
  
}

With casting

The approach described above can be suboptimal for some circuits, so it is advisable to check the complexity with promotion disabled before going to production. Here is how the implementation changes when shifts_with_promotion is set to False:

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    shifts_with_promotion=False,
)

def f(x, y):
    return x << y

inputset = [
    (np.random.randint(0, 2**3), np.random.randint(0, 2**2))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {
  
  // no promotions
  func.func @main(%arg0: !FHE.eint<3>, %arg1: !FHE.eint<2>) -> !FHE.eint<6> {
    
    // shifting for the second bit of y
    %cst = arith.constant dense<[0, 0, 1, 1]> : tensor<4xi64>
    %0 = "FHE.apply_lookup_table"(%arg1, %cst) : (!FHE.eint<2>, tensor<4xi64>) -> !FHE.eint<4>
    %cst_0 = arith.constant dense<[0, 0, 0, 2, 2, 2, 4, 4]> : tensor<8xi64>
    %1 = "FHE.apply_lookup_table"(%arg0, %cst_0) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<4>
    %2 = "FHE.add_eint"(%1, %0) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %cst_1 = arith.constant dense<[0, 0, 0, 8, 0, 16, 0, 24, 0, 32, 0, 40, 0, 48, 0, 56]> : tensor<16xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_1) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %cst_2 = arith.constant dense<[0, 6, 12, 2, 8, 14, 4, 10]> : tensor<8xi64>
    %4 = "FHE.apply_lookup_table"(%arg0, %cst_2) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<4>
    %5 = "FHE.add_eint"(%4, %0) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %cst_3 = arith.constant dense<[0, 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7]> : tensor<16xi64>
    %6 = "FHE.apply_lookup_table"(%5, %cst_3) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %7 = "FHE.add_eint"(%3, %6) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
        
    // cast x to 6-bits to compute the result using addition/subtraction
    %cst_4 = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7]> : tensor<8xi64>
    %8 = "FHE.apply_lookup_table"(%arg0, %cst_4) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<6>
    // this was done using promotion instead of casting in runtime when the flag was turned on
        
    %9 = "FHE.add_eint"(%7, %8) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
        
    // shifting for the first bit of y
    %cst_5 = arith.constant dense<[0, 1, 0, 1]> : tensor<4xi64>
    %10 = "FHE.apply_lookup_table"(%arg1, %cst_5) : (!FHE.eint<2>, tensor<4xi64>) -> !FHE.eint<4>
    %cst_6 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 6, 6, 6, 6, 6, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8, 10, 10, 10, 10, 10, 10, 10, 10, 12, 12, 12, 12, 12, 12, 12, 12, 14, 14, 14, 14, 14, 14, 14, 14]> : tensor<64xi64>
    %11 = "FHE.apply_lookup_table"(%9, %cst_6) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.eint<4>
    %12 = "FHE.add_eint"(%11, %10) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %13 = "FHE.apply_lookup_table"(%12, %cst_1) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %cst_7 = arith.constant dense<[0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14, 0, 2, 4, 6, 8, 10, 12, 14]> : tensor<64xi64>
    %14 = "FHE.apply_lookup_table"(%9, %cst_7) : (!FHE.eint<6>, tensor<64xi64>) -> !FHE.eint<4>
    %15 = "FHE.add_eint"(%14, %10) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %16 = "FHE.apply_lookup_table"(%15, %cst_3) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<6>
    %17 = "FHE.add_eint"(%13, %16) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
    %18 = "FHE.add_eint"(%17, %9) : (!FHE.eint<6>, !FHE.eint<6>) -> !FHE.eint<6>
        
    return %18 : !FHE.eint<6>
  }
  
}
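The MLIR above handles the shift one bit of y at a time: for each bit, the running value is either left unchanged or shifted by the power of two that the bit represents. A minimal plain-Python sketch of that decomposition follows. It involves no encryption, the function name is hypothetical, and the arithmetic select merely stands in for the table lookups of the real circuit.

```python
def shift_left_bitwise(x, y, y_bit_width):
    """Compute x << y by handling one bit of y at a time (plaintext sketch)."""
    result = x
    # Process the most significant bit of y first, as in the MLIR above.
    for i in reversed(range(y_bit_width)):
        bit = (y >> i) & 1              # i-th bit of the shift amount
        shifted = result << (1 << i)    # candidate: shift by 2**i positions
        # Arithmetic select: keep `shifted` when bit == 1, else keep `result`.
        result = bit * shifted + (1 - bit) * result
    return result


# Same ranges as the inputset above: 3-bit x and 2-bit y.
for x in range(2**3):
    for y in range(2**2):
        assert shift_left_bitwise(x, y, y_bit_width=2) == x << y
```

Whether the running value is widened by promotion or by a casting table lookup only changes how the intermediate bit-widths are obtained, not this overall structure.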

Contributing

There are two ways to contribute to Concrete. You can:

  • Open issues to report bugs and typos or suggest ideas;

  • Request to become an official contributor by emailing hello@zama.ai. Only approved contributors can send pull requests (PRs), so get in touch before you do.

Security

This document describes some security concepts around FHE that can help you generate parameters that are both secure and correct.

Parameter Curves

  1. Data Acquisition

    • For a given value of $(n, q = 2^{64}, \sigma)$ we obtain raw data from the Lattice Estimator, which ultimately leads to a security level $\lambda$. All relevant attacks in the Lattice Estimator are considered.

    • Increase the value of $\sigma$, until the tuple $(n, q = 2^{64}, \sigma)$ satisfies the target level of security $\lambda_{target}$.

    • Repeat for several values of $n$.

  2. Model Generation for $\lambda = \lambda_{target}$.

    • At this point, we have several sets of points $\{(n, q = 2^{64}, \sigma)\}$ satisfying the target level of security $\lambda_{target}$. From here, we fit a model to this raw data ($\sigma$ as a function of $n$).

  3. Model Verification.

    • For each model, we perform a verification check to ensure that the values output from the function $\sigma(n)$ provide the claimed level of security, $\lambda_{target}$.
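The three steps can be sketched in plain Python. The sketch below is only an illustration of the workflow: toy_security_level is a hypothetical stand-in for the Lattice Estimator with made-up coefficients, the function names and the step size are assumptions, and none of the numbers it produces are real security estimates.

```python
def toy_security_level(n, log2_sigma):
    """HYPOTHETICAL stand-in for the Lattice Estimator; not a real estimate."""
    return 0.04 * n + 1.5 * log2_sigma + 70.0


def minimal_log2_sigma(n, target, start=-60.0, step=0.25):
    """Step 1: increase sigma until the target security level is reached."""
    log2_sigma = start
    while toy_security_level(n, log2_sigma) < target:
        log2_sigma += step
    return log2_sigma


def fit_line(points):
    """Step 2: least-squares fit of log2(sigma) = a * n + b."""
    count = len(points)
    mean_n = sum(n for n, _ in points) / count
    mean_s = sum(s for _, s in points) / count
    numerator = sum((n - mean_n) * (s - mean_s) for n, s in points)
    denominator = sum((n - mean_n) ** 2 for n, _ in points)
    a = numerator / denominator
    return a, mean_s - a * mean_n


target = 128
# Step 1, repeated for several values of n.
points = [(n, minimal_log2_sigma(n, target)) for n in (1536, 1792, 2048, 2304)]
a, b = fit_line(points)

# Step 3: check the fitted model against the (toy) estimator.
margins = [toy_security_level(n, a * n + b) - target for n, _ in points]
print(f"a = {a:.5f}, b = {b:.3f}, worst margin = {min(margins):.2f} bits")
```

In the real workflow, the verification step calls the Lattice Estimator again on the values produced by the fitted model rather than reusing a closed-form function.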

These models are then used as input for Concrete, to ensure that the parameter space explored by the compiler attains the required security level. Note that we consider the RC.BDGL16 lattice reduction cost model within the Lattice Estimator. Therefore, when computing our security estimates, we use the call LWE.estimate(params, red_cost_model = RC.BDGL16) on the input parameter set params.

The cryptographic parameters are chosen considering the IND-CPA security model, and are selected with a bootstrapping failure probability fixed by the user. In particular, it is assumed that the results of decrypted computations are not shared by the secret key owner with any third parties, as such an action can lead to leakage of the secret encryption key. If you are designing an application where decryptions must be shared, you will need to craft custom encryption parameters which are chosen in consideration of the IND-CPA^D security model [1].

[1] Li, Baiyu, et al. “Securing approximate homomorphic encryption using differential privacy.” Annual International Cryptology Conference. Cham: Springer Nature Switzerland, 2022. https://eprint.iacr.org/2022/816.pdf

Usage

To generate the raw data from the lattice estimator, use:

make generate-curves

By default, this script will generate parameter curves for {80, 112, 128, 192} bits of security, using $\log_2(q) = 64$.

To compare the current curves with the output of the lattice estimator, use:

make compare-curves

To generate the associated C++ and Rust code, use:

make generate-code

Further advanced options can be found inside the Makefile.

Example

sage: X = load("128.sobj")

Entries are tuples of the form $(n, \log_2(q), \log_2(\sigma), \lambda)$. We can view individual entries via:

sage: X["128"][0]
(2366, 64.0, 4.0, 128.51)
sage: curves = load("verified_curves.sobj")

This object is a tuple containing the information required for the four security curves ({80, 112, 128, 192} bits of security). Looking at one of the entries:

sage: curves[2][:3]
(-0.026599462343105267, 2.981543184145991, 128)

Here we can see the linear model parameters $(a = -0.026599462343105267, b = 2.981543184145991)$ along with the security level 128. This linear model can be used to generate secure parameters in the following way: for $q = 2^{64}$, if we have an LWE dimension of $n = 1536$, then the required noise size is:

$\sigma = a \cdot n + b \approx -37.85$

This value corresponds to the logarithm of the relative error size. Using the parameter set $(n, \log(q), \sigma = 2^{64 - 37.85})$ in the Lattice Estimator confirms a 128-bit security level.
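As a quick check of the arithmetic, the linear model can be evaluated directly. This is a minimal sketch: the coefficients are copied from the curves[2] entry shown above, and the function name is hypothetical. Evaluated at full precision, the printed coefficients give roughly -37.88, close to the rounded figure quoted in the text.

```python
# Linear model for 128 bits of security, copied from curves[2] above.
a = -0.026599462343105267
b = 2.981543184145991
log2_q = 64


def log2_relative_noise(n):
    """log2 of the relative error size required for LWE dimension n."""
    return a * n + b


n = 1536
relative = log2_relative_noise(n)
absolute = log2_q + relative  # log2 of the noise standard deviation for q = 2**64

print(f"log2(relative noise) = {relative:.3f}")
print(f"log2(absolute noise) = {absolute:.3f}")
```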

Min/Max operations

This document explains how to compute the minimum and maximum of values in Concrete, and describes the different strategies available to make these computations faster.

Finding the minimum or maximum of two numbers is not a native operation in Concrete, so it needs to be implemented using existing native operations (i.e., additions, clear multiplications, negations, table lookups). Concrete offers two different implementations for this.

Chunked

This is the most general implementation that can be used in any situation. The idea is:

# (example below is for bit-width of 8 and chunk size of 4)

# compare lhs and rhs
select_lhs = lhs < rhs  # or lhs > rhs for maximum

# multiply lhs with select_lhs
lhs_contribution = lhs * select_lhs

# multiply rhs with 1 - select_lhs
rhs_contribution = rhs * (1 - select_lhs)

# compute the result
result = lhs_contribution + rhs_contribution

Notes

  • Multiplications with the operands are not allowed to increase the bit-width of the inputs, so they are very expensive as well.

  • Optimal chunk size is selected automatically to reduce the number of table lookups.

  • Chunked comparisons result in at least 9 and at most 21 table lookups.

  • It is used if no other implementation can be used.

Pros

  • Can be used with any integers.

Cons

  • Extremely expensive.

Example

import numpy as np
from concrete import fhe

def f(x, y):
    return np.minimum(x, y)

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**4))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, show_mlir=True)

produces

module {

  func.func @main(%arg0: !FHE.eint<4>, %arg1: !FHE.eint<4>) -> !FHE.eint<4> {
  
    // calculating select_x, which is x < y since we're computing the minimum
    %cst = arith.constant dense<[0, 0, 0, 0, 4, 4, 4, 4, 8, 8, 8, 8, 12, 12, 12, 12]> : tensor<16xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    %cst_0 = arith.constant dense<[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]> : tensor<16xi64>
    %1 = "FHE.apply_lookup_table"(%arg1, %cst_0) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    %2 = "FHE.add_eint"(%0, %1) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %cst_1 = arith.constant dense<[0, 1, 1, 1, 2, 0, 1, 1, 2, 2, 0, 1, 2, 2, 2, 0]> : tensor<16xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_1) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    %cst_2 = arith.constant dense<[0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 8, 12]> : tensor<16xi64>
    %4 = "FHE.apply_lookup_table"(%arg0, %cst_2) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    %cst_3 = arith.constant dense<[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]> : tensor<16xi64>
    %5 = "FHE.apply_lookup_table"(%arg1, %cst_3) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    %6 = "FHE.add_eint"(%4, %5) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %cst_4 = arith.constant dense<[0, 4, 4, 4, 8, 0, 4, 4, 8, 8, 0, 4, 8, 8, 8, 0]> : tensor<16xi64>
    %7 = "FHE.apply_lookup_table"(%6, %cst_4) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<4>
    %8 = "FHE.add_eint"(%7, %3) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    %cst_5 = arith.constant dense<[0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]> : tensor<16xi64>
    %9 = "FHE.apply_lookup_table"(%8, %cst_5) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<1>
    
    // extracting the first 2 bits of x shifted to the left by 1 bit for packing
    %cst_6 = arith.constant dense<[0, 2, 4, 6, 0, 2, 4, 6, 0, 2, 4, 6, 0, 2, 4, 6]> : tensor<16xi64>
    %10 = "FHE.apply_lookup_table"(%arg0, %cst_6) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<3>
    
    // casting select_x to 3 bits for packing
    %cst_7 = arith.constant dense<[0, 1]> : tensor<2xi64>
    %11 = "FHE.apply_lookup_table"(%9, %cst_7) : (!FHE.eint<1>, tensor<2xi64>) -> !FHE.eint<3>
    
    // packing the first 2 bits of x with select_x
    %12 = "FHE.add_eint"(%10, %11) : (!FHE.eint<3>, !FHE.eint<3>) -> !FHE.eint<3>
    
    // calculating contribution of 0 if select_x is 0 else the first 2 bits of x
    %cst_8 = arith.constant dense<[0, 0, 0, 1, 0, 2, 0, 3]> : tensor<8xi64>
    %13 = "FHE.apply_lookup_table"(%12, %cst_8) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<4>
    
    // extracting the last 2 bits of x shifted to the left by 1 bit for packing
    %cst_9 = arith.constant dense<[0, 0, 0, 0, 2, 2, 2, 2, 4, 4, 4, 4, 6, 6, 6, 6]> : tensor<16xi64>
    %14 = "FHE.apply_lookup_table"(%arg0, %cst_9) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<3>
    
    // packing the last 2 bits of x with select_x
    %15 = "FHE.add_eint"(%14, %11) : (!FHE.eint<3>, !FHE.eint<3>) -> !FHE.eint<3>
    
    // calculating contribution of 0 if select_x is 0 else the last 2 bits of x shifted by 2 bits for direct addition
    %cst_10 = arith.constant dense<[0, 0, 0, 4, 0, 8, 0, 12]> : tensor<8xi64>
    %16 = "FHE.apply_lookup_table"(%15, %cst_10) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<4>
    
    // computing x * select_x
    %17 = "FHE.add_eint"(%13, %16) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    
    // extracting the first 2 bits of y shifted to the left by 1 bit for packing
    %18 = "FHE.apply_lookup_table"(%arg1, %cst_6) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<3>
    
    // packing the first 2 bits of y with select_x
    %19 = "FHE.add_eint"(%18, %11) : (!FHE.eint<3>, !FHE.eint<3>) -> !FHE.eint<3>
    
    // calculating contribution of 0 if select_x is 1 else the first 2 bits of y
    %cst_11 = arith.constant dense<[0, 0, 1, 0, 2, 0, 3, 0]> : tensor<8xi64>
    %20 = "FHE.apply_lookup_table"(%19, %cst_11) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<4>
    
    // extracting the last 2 bits of y shifted to the left by 1 bit for packing
    %21 = "FHE.apply_lookup_table"(%arg1, %cst_9) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.eint<3>
    
    // packing the last 2 bits of y with select_x
    %22 = "FHE.add_eint"(%21, %11) : (!FHE.eint<3>, !FHE.eint<3>) -> !FHE.eint<3>
    
    // calculating contribution of 0 if select_x is 1 else the last 2 bits of y shifted by 2 bits for direct addition
    %cst_12 = arith.constant dense<[0, 0, 4, 0, 8, 0, 12, 0]> : tensor<8xi64>
    %23 = "FHE.apply_lookup_table"(%22, %cst_12) : (!FHE.eint<3>, tensor<8xi64>) -> !FHE.eint<4>
    
    // computing y * (1 - select_x)
    %24 = "FHE.add_eint"(%20, %23) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
    
    // computing the result
    %25 = "FHE.add_eint"(%17, %24) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>

    return %25 : !FHE.eint<4>
    
  }
  
}
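The packing step in the MLIR above can be reproduced in plain Python. Each 2-bit chunk is shifted left by one bit, select_x is added in the lowest bit, and a single table then returns either the chunk or 0. The sketch below is a plaintext illustration only; the function names are hypothetical, and the comparison is done directly instead of through the chunked comparison.

```python
def select_table(chunk_bits, shift=0, keep_when=1):
    """Table indexed by packed = 2 * chunk + select.

    Returns (chunk << shift) when select == keep_when, otherwise 0.
    """
    table = []
    for packed in range(1 << (chunk_bits + 1)):
        chunk, select = packed >> 1, packed & 1
        table.append((chunk << shift) if select == keep_when else 0)
    return table


def minimum_4bit(x, y):
    """Minimum of two 4-bit values using two 2-bit chunks per operand."""
    select_x = int(x < y)  # a chunked comparison in the real circuit
    low_x, high_x = x & 0b11, x >> 2
    low_y, high_y = y & 0b11, y >> 2
    # x * select_x, assembled from the two chunks of x
    x_part = (select_table(2)[2 * low_x + select_x]
              + select_table(2, shift=2)[2 * high_x + select_x])
    # y * (1 - select_x), assembled from the two chunks of y
    y_part = (select_table(2, keep_when=0)[2 * low_y + select_x]
              + select_table(2, shift=2, keep_when=0)[2 * high_y + select_x])
    return x_part + y_part


# The generated tables match the constants %cst_8 and %cst_12 above.
assert select_table(2) == [0, 0, 0, 1, 0, 2, 0, 3]
assert select_table(2, shift=2, keep_when=0) == [0, 0, 4, 0, 8, 0, 12, 0]
```

Because the select bit is packed into the table input, the multiplication never needs a bit-width larger than the chunk size plus one.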

Min/Max Trick

This implementation uses the fact that [min, max](x, y) is equal to [min, max](x - y, 0) + y, which is just a subtraction, a table lookup, and an addition.
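The identity is easy to check on plain integers. In this minimal sketch (hypothetical function names, no encryption), the min or max against zero is the only non-linear step, which is the part that becomes the table lookup.

```python
def minimum_via_trick(x, y):
    difference = x - y                 # subtraction
    clipped = min(difference, 0)       # the table lookup in the FHE circuit
    return clipped + y                 # addition


def maximum_via_trick(x, y):
    difference = x - y
    clipped = max(difference, 0)
    return clipped + y


for x in range(-8, 8):
    for y in range(-8, 8):
        assert minimum_via_trick(x, y) == min(x, y)
        assert maximum_via_trick(x, y) == max(x, y)
```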

There are two major problems with this implementation though:

  1. subtraction before the TLU requires up to 2 additional bits to avoid overflows (it is 1 in most cases).

  2. subtraction and addition require the same bit-width across operands.

What this means is that if we are comparing uint3 and uint6, we need to convert both of them to uint7 in some way to do the subtraction and proceed with the TLU in 7-bits. There are 2 ways to achieve this behavior.
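The bit-width needed for the subtraction can be worked out from the ranges of the operands. The sketch below assumes unsigned inputs and a signed two's-complement difference; the helper names are hypothetical and are not part of Concrete.

```python
def signed_bit_width(lowest, highest):
    """Smallest two's-complement width that can hold [lowest, highest]."""
    width = 1
    while not (-(1 << (width - 1)) <= lowest and highest <= (1 << (width - 1)) - 1):
        width += 1
    return width


def difference_bit_width(x_bits, y_bits):
    """Width needed to store x - y for unsigned x and y of the given widths."""
    lowest = -((1 << y_bits) - 1)   # x = 0, y at its maximum
    highest = (1 << x_bits) - 1     # x at its maximum, y = 0
    return signed_bit_width(lowest, highest)


print(difference_bit_width(3, 6))  # uint3 and uint6, as in the text above
print(difference_bit_width(4, 2))  # uint4 and uint2, as in the examples below
```

For uint3 and uint6 the difference ranges over [-63, 7], which needs 7 bits; for uint4 and uint2 it ranges over [-3, 15], which needs 5 bits.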

Requirements

  • (x - y).bit_width <= MAXIMUM_TLU_BIT_WIDTH

1. fhe.MinMaxStrategy.ONE_TLU_PROMOTED

This strategy makes sure that during bit-width assignment, both operands are assigned the same bit-width, and that bit-width contains at least the amount of bits required to store x - y. The idea is:

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_promoted_to_uint7 - y_promoted_to_uint7] + y_promoted_to_uint7

Pros

  • It will always result in a single table lookup.

Cons

  • It will increase the bit-width of both operands and the result, and lock them together across the whole circuit, which can result in significant slowdowns if the result or the operands are used in other costly operations.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    min_max_strategy_preference=fhe.MinMaxStrategy.ONE_TLU_PROMOTED,
)

def f(x, y):
    return np.minimum(x, y)

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**2))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {

  // promotions          ............         ............
  func.func @main(%arg0: !FHE.eint<5>, %arg1: !FHE.eint<5>) -> !FHE.eint<5> {
  
    // subtraction
    %0 = "FHE.to_signed"(%arg0) : (!FHE.eint<5>) -> !FHE.esint<5>
    %1 = "FHE.to_signed"(%arg1) : (!FHE.eint<5>) -> !FHE.esint<5>
    %2 = "FHE.sub_eint"(%0, %1) : (!FHE.esint<5>, !FHE.esint<5>) -> !FHE.esint<5>
    
    // tlu
    %cst = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -16, -15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1]> : tensor<32xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst) : (!FHE.esint<5>, tensor<32xi64>) -> !FHE.eint<5>
    
    // addition
    %4 = "FHE.add_eint"(%3, %arg1) : (!FHE.eint<5>, !FHE.eint<5>) -> !FHE.eint<5>
    
    return %4 : !FHE.eint<5>
    
  }
  
}
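The 32-entry constant in the MLIR above can be regenerated in a few lines. This sketch assumes the table is indexed by the two's-complement encoding of the 5-bit signed difference, which is consistent with the values shown; the function name is hypothetical.

```python
def min_trick_table(bit_width):
    """Lookup table computing min(difference, 0) for a signed difference."""
    half = 1 << (bit_width - 1)
    # Two's-complement order: non-negative values first, then negative ones.
    differences = list(range(0, half)) + list(range(-half, 0))
    return [min(difference, 0) for difference in differences]


table = min_trick_table(5)
assert table[:16] == [0] * 16                 # x - y >= 0: contribute nothing
assert table[16:] == list(range(-16, 0))      # x - y < 0: contribute x - y
```

Adding y to the looked-up value then yields y when x >= y and x otherwise, which is the minimum.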

2. fhe.MinMaxStrategy.THREE_TLU_CASTED

This strategy will not put any constraint on bit-widths during bit-width assignment. Instead, operands are cast to a bit-width that can store x - y during runtime using table lookups. The idea is:

uint3_to_uint7_lut = fhe.LookupTable([...])
x_cast_to_uint7 = uint3_to_uint7_lut[x]

uint6_to_uint7_lut = fhe.LookupTable([...])
y_cast_to_uint7 = uint6_to_uint7_lut[y]

comparison_lut = fhe.LookupTable([...])
result = comparison_lut[x_cast_to_uint7 - y_cast_to_uint7] + y

Notes

  • It can result in a single table lookup as well, if x and y are assigned (because of other operations) the same bit-width, and that bit-width can store x - y.

  • Or in two table lookups if only one of the operands is assigned a bit-width bigger than or equal to the bit width that can store x - y.

Pros

  • It will not put any constraints on bit-widths of the operands, which is amazing if they are used in other costly operations.

  • It will result in at most 3 table lookups, which is still good.

Cons

  • If you are not doing anything else with the operands, or doing less costly operations compared to comparison, it will introduce up to two unnecessary table lookups and slow down execution compared to fhe.MinMaxStrategy.ONE_TLU_PROMOTED.

Example

import numpy as np
from concrete import fhe

configuration = fhe.Configuration(
    min_max_strategy_preference=fhe.MinMaxStrategy.THREE_TLU_CASTED,
)

def f(x, y):
    return np.minimum(x, y)

inputset = [
    (np.random.randint(0, 2**4), np.random.randint(0, 2**2))
    for _ in range(100)
]

compiler = fhe.Compiler(f, {"x": "encrypted", "y": "encrypted"})
circuit = compiler.compile(inputset, configuration, show_mlir=True)

produces

module {

  // no promotions
  func.func @main(%arg0: !FHE.eint<4>, %arg1: !FHE.eint<2>) -> !FHE.eint<2> {
  
    // casting x
    %cst = arith.constant dense<[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]> : tensor<16xi64>
    %0 = "FHE.apply_lookup_table"(%arg0, %cst) : (!FHE.eint<4>, tensor<16xi64>) -> !FHE.esint<5>
    
    // casting y
    %cst_0 = arith.constant dense<[0, 1, 2, 3]> : tensor<4xi64>
    %1 = "FHE.apply_lookup_table"(%arg1, %cst_0) : (!FHE.eint<2>, tensor<4xi64>) -> !FHE.esint<5>
    
    // subtraction
    %2 = "FHE.sub_eint"(%0, %1) : (!FHE.esint<5>, !FHE.esint<5>) -> !FHE.esint<5>
    
    // tlu
    %cst_1 = arith.constant dense<[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -16, -15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1]> : tensor<32xi64>
    %3 = "FHE.apply_lookup_table"(%2, %cst_1) : (!FHE.esint<5>, tensor<32xi64>) -> !FHE.eint<2>
    
    // addition
    %4 = "FHE.add_eint"(%3, %arg1) : (!FHE.eint<2>, !FHE.eint<2>) -> !FHE.eint<2>
    
    return %4 : !FHE.eint<2>
    
  }
  
}

Summary

Strategy            Minimum # of TLUs   Maximum # of TLUs   Can increase the bit-width of the inputs

CHUNKED             9                   21

ONE_TLU_PROMOTED    1                   1                   ✓

THREE_TLU_CASTED    1                   3

Concrete will choose the best strategy available after bit-width assignment, regardless of the specified preference.

Different strategies are good for different circuits. If you want the best runtime for your use case, you can compile your circuit with each of the min/max strategy preferences, and pick the one with the lowest complexity.
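One way to automate that comparison is sketched below. pick_lowest_complexity is a hypothetical helper, not part of Concrete. concrete_complexity shows how it might be wired to the compiler, under the assumption that the compiled circuit exposes its estimated cost as circuit.complexity; check the API reference of your Concrete version before relying on it.

```python
def pick_lowest_complexity(preferences, complexity_of):
    """Evaluate every preference and return the cheapest one with all results."""
    results = {preference: complexity_of(preference) for preference in preferences}
    best = min(results, key=results.get)
    return best, results


def concrete_complexity(function, encryption, inputset, preference):
    """Compile `function` with one min/max strategy preference (assumed API)."""
    from concrete import fhe  # imported lazily so the helper above stays standalone

    configuration = fhe.Configuration(min_max_strategy_preference=preference)
    compiler = fhe.Compiler(function, encryption)
    circuit = compiler.compile(inputset, configuration)
    return circuit.complexity  # assumption: complexity is exposed as a property


# Stand-in numbers to show the selection logic; these are not real measurements.
fake_complexities = {"CHUNKED": 9.0e8, "ONE_TLU_PROMOTED": 2.5e8, "THREE_TLU_CASTED": 1.8e8}
best, results = pick_lowest_complexity(list(fake_complexities), fake_complexities.get)
print(best)
```

With Concrete installed, complexity_of could be a lambda that calls concrete_complexity with fhe.MinMaxStrategy.CHUNKED, ONE_TLU_PROMOTED, and THREE_TLU_CASTED in turn.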

Optimizer

concrete-optimizer is a tool that selects appropriate cryptographic parameters for a given fully homomorphic encryption (FHE) computation. These parameters have an impact on the security, correctness, and efficiency of the computation.

The cryptographic parameters are degrees of freedom in the FHE algorithms (bootstrapping, keyswitching, etc.) that need to be fixed. The search space for possible crypto-parameters is finite but extremely large. The role of the optimizer is to quickly find the most efficient crypto-parameters possible while guaranteeing security and correctness.

Security, Correctness, and Efficiency

Security

The security level is chosen by the user. We typically operate at a fixed security level, such as 128 bits, to ensure that there is never a trade-off between security and efficiency. This constraint imposes a minimum amount of noise in all ciphertexts.

Correctness

Correctness decreases as the level of noise increases. Noise accumulates during homomorphic computation until it is actively reduced via bootstrapping. Too much noise can lead to the result of a computation being inaccurate or completely incorrect.

Before optimization, we compute a noise bound that guarantees a given error level (under the assumption that noise growth is correctly managed via bootstrapping). The noise growth depends on a critical quantity: the 2-norm of any dot product (or equivalent) present in the calculus. This 2-norm changes the scale of the noise, so we must reduce it sufficiently for the next dot product operation whenever we reduce the noise.

The user can control error probability in two ways: via the PBS error probability and the global error probability.

The PBS error probability controls correctness locally (i.e., represents the error probability of a single PBS operation), while the global error probability focuses on the overall computation result (i.e., represents the error probability of the entire computation). These probabilities are related, and choosing which one to use may depend on the specific use case.
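To get a feel for how the two probabilities relate, consider a simplified model in which every PBS fails independently with the same probability. This model is an assumption made for illustration and is not necessarily how the optimizer computes the global probability; the function names are hypothetical.

```python
def global_error_probability(pbs_error, pbs_count):
    """Probability that at least one of `pbs_count` independent PBS fails."""
    return 1.0 - (1.0 - pbs_error) ** pbs_count


def pbs_error_for_global_target(global_error, pbs_count):
    """Per-PBS error probability needed to reach a global target."""
    return 1.0 - (1.0 - global_error) ** (1.0 / pbs_count)


# Under this model, the global probability grows with the number of PBS.
for count in (1, 10, 100, 1000):
    print(count, global_error_probability(1e-5, count))
```

Under this model, a fixed PBS error probability becomes less reliable as the circuit grows, while a fixed global error probability forces each PBS to be more accurate.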

Efficiency

Efficiency decreases as more precision is required, e.g., 7 bits versus 8 bits. The larger the 2-norm is, the bigger the noise will be after a dot product. To remain below the noise bound, we must ensure that the inputs to the dot product have a sufficiently small noise level. The smaller this noise is, the slower the previous bootstrapping will be. Therefore, the larger the 2-norm is, the slower the computation will be.
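The role of the 2-norm can be illustrated with a small calculation. The sketch assumes a dot product with clear integer weights whose encrypted inputs carry independent noise of equal standard deviation; in that setting the output noise standard deviation is the input one scaled by the 2-norm of the weights. The function names are hypothetical.

```python
import math


def two_norm(weights):
    """Euclidean norm of the clear weights of a dot product."""
    return math.sqrt(sum(weight * weight for weight in weights))


def output_noise_std(weights, input_noise_std):
    """Noise after the dot product, assuming independent, equal input noise."""
    return two_norm(weights) * input_noise_std


weights = [3, 4]
print(two_norm(weights))
print(output_noise_std(weights, input_noise_std=2.0))
```

Doubling every weight doubles the 2-norm, so the inputs must arrive with half the noise to stay under the same bound.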

How are the parameters optimized

The optimization prioritizes security and correctness. This means that the security level (or the probability of correctness) could, in practice, be a bit higher than the level which is requested by the user.

In the simplest case, the optimizer performs an exhaustive search in the full parameter space and selects the best solution. While the space to explore is huge, exact lower bound cuts are used to avoid exploring regions which are guaranteed to not contain an optimal point. This makes the process both fast and exhaustive. This case is called mono-parameter, where all parameters are shared by the whole computation graph.

In more complex cases, the optimizer iteratively performs an exhaustive search, with lower bound cuts in a wide subspace of the full parameter space, until it converges to a locally optimal solution. Since the wide subspace is large and multi-dimensional, it should not be trapped in a poor locally optimal solution. The more complex case is called multi-parameter, where different calculus operations have tailored parameters.

How can I determine, understand, and explore crypto-parameters

Citing

If you use this tool in your work, please cite:

Bergerat, Loris and Boudi, Anas and Bourgerie, Quentin and Chillotti, Ilaria and Ligier, Damien and Orfila, Jean-Baptiste and Tap, Samuel. Parameter Optimization and Larger Precision for (T)FHE. Journal of Cryptology, 2023, Volume 36.

Benchmarking

This document gives an overview of the benchmarking infrastructure of Concrete.

Concrete Python

How to run all benchmarks?

Use the makefile target:

Note that this command removes the previous benchmark results before doing the benchmark.

How to run a single benchmark?

Since the full benchmark suite takes a long time to run, it's not recommended for development. Instead, use the following command to run just a single benchmark.

This command would only run the benchmarks defined in benchmarks/foo.py. It also retains the previous runs, so it can be run back to back to collect data from multiple benchmarks.

How to add new benchmarks?

Simply add a new Python script in benchmarks directory and write your logic.

The recommended file structure is as follows:

Feel free to check benchmarks/primitive.py to see this structure in action.

SDFG dialect

A dialect for the construction of static data flow graphs. The data flow graph is composed of a set of processes, connected through data streams. Special streams allow for data to be injected into and to be retrieved from the data flow graph.

Operation definition

SDFG.get (::mlir::concretelang::SDFG::Get)

Retrieves a data element from a stream

Retrieves a single data element from the specified stream (i.e., an instance of the element type of the stream).

Example:

Operands:

Results:

SDFG.init (::mlir::concretelang::SDFG::Init)

Initializes the streaming framework

Initializes the streaming framework. This operation must be performed before control reaches any other operation from the dialect.

Example:

Results:

SDFG.make_process (::mlir::concretelang::SDFG::MakeProcess)

Creates a new SDFG process

Creates a new SDFG process and connects it to the input and output streams.

Example:

Attributes:

Operands:

SDFG.make_stream (::mlir::concretelang::SDFG::MakeStream)

Returns a new SDFG stream

Returns a new SDFG stream, transporting data either between processes on the device, from the host to the device or from the device to the host. All streams are typed, allowing data to be read / written through SDFG.get and SDFG.put only using the stream's type.

Example:

Attributes:

Operands:

Results:

SDFG.put (::mlir::concretelang::SDFG::Put)

Writes a data element to a stream

Writes the input operand to the specified stream. The operand's type must meet the element type of the stream.

Example:

Operands:

SDFG.shutdown (::mlir::concretelang::SDFG::Shutdown)

Shuts down the streaming framework

Shuts down the streaming framework. This operation must be performed after any other operation from the dialect.

Example:

Operands:

SDFG.start (::mlir::concretelang::SDFG::Start)

Finalizes the creation of an SDFG and starts execution of its processes

Finalizes the creation of an SDFG and starts execution of its processes. Any creation of streams and processes must take place before control reaches this operation.

Example:

Operands:

Attribute definition

ProcessKindAttr

Process kind

Syntax:

Parameters:

StreamKindAttr

Stream kind

Syntax:

Parameters:

Type definition

DFGType

An SDFG data flow graph

Syntax: !SDFG.dfg

A handle to an SDFG data flow graph

StreamType

An SDFG data stream

An SDFG stream to connect SDFG processes.

Parameters:

Runtime dialect

A dialect for representing the abstractions needed by the runtime.

Operation definition

RT.await_future (::mlir::concretelang::RT::AwaitFutureOp)

Wait for a future and access its data.

The results of a dataflow task are always futures which could be further used as inputs to subsequent tasks. When the result of a task is needed in the outer execution context, the result future needs to be synchronized and its data accessed using RT.await_future.

Operands:

Results:

RT.build_return_ptr_placeholder (::mlir::concretelang::RT::BuildReturnPtrPlaceholderOp)

Results:

RT.clone_future (::mlir::concretelang::RT::CloneFutureOp)

Interfaces: AllocationOpInterface, MemoryEffectOpInterface

Operands:

Results:

RT.create_async_task (::mlir::concretelang::RT::CreateAsyncTaskOp)

Create a dataflow task.

Attributes:

Operands:

RT.dataflow_task (::mlir::concretelang::RT::DataflowTaskOp)

Dataflow task operation

RT.dataflow_task specifies a task that will be executed concurrently once its operands are ready. Operands are either the results of computations in other RT.dataflow_task operations (dataflow dependences) or obtained from the execution context (immediate operands). Operands are synchronized using futures and, in the case of immediate operands, copied when the task is created. Caution is required when the operand is a pointer, as no deep copy will occur.

Example:

Traits: AutomaticAllocationScope, SingleBlockImplicitTerminator

Interfaces: AllocationOpInterface, MemoryEffectOpInterface, RegionBranchOpInterface

Operands:

Results:

RT.dataflow_yield (::mlir::concretelang::RT::DataflowYieldOp)

Dataflow yield operation

RT.dataflow_yield is a special terminator operation for blocks inside the region of an RT.dataflow_task. It specifies the return values of the RT.dataflow_task.

Example:

Traits: ReturnLike, Terminator

Operands:

RT.deallocate_future_data (::mlir::concretelang::RT::DeallocateFutureDataOp)

Operands:

RT.deallocate_future (::mlir::concretelang::RT::DeallocateFutureOp)

Operands:

RT.deref_return_ptr_placeholder (::mlir::concretelang::RT::DerefReturnPtrPlaceholderOp)

Operands:

Results:

RT.deref_work_function_argument_ptr_placeholder (::mlir::concretelang::RT::DerefWorkFunctionArgumentPtrPlaceholderOp)

Operands:

Results:

RT.make_ready_future (::mlir::concretelang::RT::MakeReadyFutureOp)

Build a ready future.

Data passed to dataflow tasks must be encapsulated in futures, including immediate operands. These must be converted into futures using RT.make_ready_future.

Interfaces: AllocationOpInterface, MemoryEffectOpInterface

Operands:

Results:

RT.register_task_work_function (::mlir::concretelang::RT::RegisterTaskWorkFunctionOp)

Register the task work-function with the runtime system.

Operands:

RT.work_function_return (::mlir::concretelang::RT::WorkFunctionReturnOp)

Operands:

Type definition

FutureType

Future with a parameterized element type

The value of a !RT.future type represents the result of an asynchronous operation.

Examples:

Parameters:

PointerType

Pointer to a parameterized element type

Parameters:

To select secure cryptographic parameters for use in Concrete, we use the Lattice-Estimator. In particular, we use the following workflow:

This will compare the four curves generated above against the output of the version of the lattice estimator found in the third_party folder.

To look at the raw data gathered in step 1, we can look in the sage-object folder. These objects can be loaded in the following way using SageMath:

To view the interpolated curves, we load the verified_curves.sobj object inside the sage-object folder.

Initial comparison is chunked as well, which is already very expensive.

The computation is guaranteed to be secure at the given security level (see here for details), which is typically 128 bits. The correctness of the computation is guaranteed up to a given failure probability. A surrogate of the execution time is minimized, which allows for efficient FHE computation.

An independent public research tool, the lattice estimator, is used to estimate the security level. The lattice estimator is maintained by FHE experts. For a given set of crypto-parameters, this tool considers all possible attacks and returns a security level.

For each security level, a parameter curve of the appropriate minimal error level is pre-computed using the lattice estimator, and is used as an input to the optimizer. Learn more about the parameter curves here.

You can consult the reference crypto-parameters for each security level (for a given correctness). They give insight into the relationship between the content of the computation (for example, maximum precision or maximum dot 2-norm) and the cost.

You can then manually explore the crypto-parameter space using a CLI tool.

A pre-print is available as Cryptology ePrint Archive Paper 2022/704.

Concrete Python uses progress-tracker-python to do benchmarks. Please refer to its README to learn how it works.
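To make the role of a parameter curve concrete, here is a minimal sketch. It assumes the curve is modeled as a straight line giving the minimal log2 of the noise standard deviation as a function of the LWE dimension; the coefficients `SLOPE` and `BIAS` are made-up placeholders, not values produced by the lattice estimator, and the function names are hypothetical.

```python
# Placeholder coefficients for one security level (NOT real estimator output).
SLOPE = -0.026
BIAS = 2.9


def minimal_log2_std(lwe_dimension, slope=SLOPE, bias=BIAS):
    """Smallest log2(noise std) considered secure for this LWE dimension,
    under the assumed linear model."""
    return slope * lwe_dimension + bias


def is_secure(lwe_dimension, log2_std):
    """A parameter pair is acceptable if its noise is at least the minimum."""
    return log2_std >= minimal_log2_std(lwe_dimension)


# A larger dimension tolerates a smaller noise at the same security level.
small_dim_bound = minimal_log2_std(800)
large_dim_bound = minimal_log2_std(1024)
```

An optimizer can use such a curve as a constraint: for each candidate LWE dimension it only considers noise levels on or above the line, then picks the cheapest candidate that still meets the correctness target.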
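Measuring the execution time of a block of code, which is what a benchmark tracker does for timing metrics, can be sketched with the standard library alone. The `measure` helper and the `measurements` dictionary below are hypothetical and are not the progress-tracker-python API; they only show the context-manager pattern.

```python
import time
from contextlib import contextmanager

# Collected samples, keyed by metric id (milliseconds).
measurements = {}


@contextmanager
def measure(metric_id):
    """Record the wall-clock duration of the enclosed block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        measurements.setdefault(metric_id, []).append(elapsed_ms)


# The execution time of this block is recorded under "sum-ms".
with measure("sum-ms"):
    total = sum(range(100_000))
```

Recording in a `finally` clause means a sample is stored even when the measured block raises, which keeps partial benchmark runs inspectable.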

Lattice-Estimator
third_party folder
sage-object folder
sage-object folder
chunked
make benchmark
TARGET=foo make benchmark-target
# import progress tracker
import py_progress_tracker as progress

# import any other dependencies
from concrete import fhe

# create a list of targets to benchmark
targets = [
    {
        "id": (
            f"name-of-the-benchmark :: "
            f"parameter1 = {foo} | parameter2 = {bar}"
        ),
        "name": (
            f"Name of the benchmark with parameter1 of {foo} and parameter2 of {bar}"
        ),
        "parameters": {
            "parameter1": foo,
            "parameter2": bar,
        },
    }
]

# write the benchmark logic
@progress.track(targets)
def main(parameter1, parameter2):
    ...

    # to track timings
    with progress.measure(id="some-metric-ms", label="Some metric (ms)"):
        # execution time of this block will be measured
        ...

    ...

    # to track values
    progress.measure(id="another-metric", label="Another metric", value=some_metric)

    ...
"SDFG.get" (%stream) : (!SDFG.stream<1024xi64>) -> (tensor<1024xi64>)

stream

An SDFG data stream

data

any type

"SDFG.init" : () -> !SDFG.dfg

«unnamed»

An SDFG data flow graph

%in0 = "SDFG.make_stream" { type = #SDFG.stream_kind<host_to_device> }(%dfg) : (!SDFG.dfg) -> !SDFG.stream<tensor<1024xi64>>
%in1 = "SDFG.make_stream" { type = #SDFG.stream_kind<host_to_device> }(%dfg) : (!SDFG.dfg) -> !SDFG.stream<tensor<1024xi64>>
%out = "SDFG.make_stream" { type = #SDFG.stream_kind<device_to_host> }(%dfg) : (!SDFG.dfg) -> !SDFG.stream<tensor<1024xi64>>
"SDFG.make_process" { type = #SDFG.process_kind<add_eint> }(%dfg, %in0, %in1, %out) :
  (!SDFG.dfg, !SDFG.stream<tensor<1024xi64>>, !SDFG.stream<tensor<1024xi64>>, !SDFG.stream<tensor<1024xi64>>) -> ()

type

::mlir::concretelang::SDFG::ProcessKindAttr

Process kind

dfg

An SDFG data flow graph

streams

An SDFG data stream

"SDFG.make_stream" { name = "stream", type = #SDFG.stream_kind<host_to_device> }(%dfg)
  : (!SDFG.dfg) -> !SDFG.stream<tensor<1024xi64>>

name

::mlir::StringAttr

string attribute

type

::mlir::concretelang::SDFG::StreamKindAttr

Stream kind

dfg

An SDFG data flow graph

«unnamed»

An SDFG data stream

"SDFG.put" (%stream, %data) : (!SDFG.stream<1024xi64>, tensor<1024xi64>) -> ()

stream

An SDFG data stream

data

any type

"SDFG.shutdown" (%dfg) : !SDFG.dfg

dfg

An SDFG data flow graph

"SDFG.start"(%dfg) : !SDFG.dfg

dfg

An SDFG data flow graph

#SDFG.process_kind<
  ::mlir::concretelang::SDFG::ProcessKind   # value
>

value

::mlir::concretelang::SDFG::ProcessKind

an enum of type ProcessKind

#SDFG.stream_kind<
  ::mlir::concretelang::SDFG::StreamKind   # value
>

value

::mlir::concretelang::SDFG::StreamKind

an enum of type StreamKind

elementType

Type

input

Future with a parameterized element type

output

any type

output

Pointer to a parameterized element type

input

Future with a parameterized element type

output

Future with a parameterized element type

workfn

::mlir::SymbolRefAttr

symbol reference attribute

list

any type

func @test(%0 : i64) -> (i64, i64) {
    // Execute right now as %0 is ready.
    %1, %2 = "RT.dataflow_task"(%0) ({
        %a = addi %0, %0 : i64
        %b = muli %0, %0 : i64
        "RT.dataflow_yield"(%a, %b) : (i64, i64) -> ()
    }) : (i64) -> (i64, i64)
    // Concurrently execute both tasks below when the task above is completed.
    %3 = "RT.dataflow_task"(%1) ({
        %c = constant 1 : i64
        %a = addi %1, %c : i64
        "RT.dataflow_yield"(%a) : (i64) -> ()
    }) : (i64) -> i64
    %4 = "RT.dataflow_task"(%2) ({
        %c = constant 2 : i64
        %a = addi %2, %c : i64
        "RT.dataflow_yield"(%a) : (i64) -> ()
    }) : (i64) -> i64
    return %3, %4 : i64, i64
}

inputs

any type

outputs

any type

%0 = constant 1 : i64
%1 = constant 2 : i64
"RT.dataflow_yield"(%0, %1) : (i64, i64) -> ()

values

any type

input

Future with a parameterized element type

input

any type

input

Pointer to a parameterized element type

output

Future with a parameterized element type

input

Pointer to a parameterized element type

output

any type

input

any type

memrefCloned

any type

output

Future with a parameterized element type

list

any type

in

any type

out

any type

!RT.future<i64>

elementType

Type

elementType

Type

here
lattice estimator
here
reference crypto-parameters
CLI tool
Paper 2022/704
progress-tracker-python
README

Examples

This document gives an overview of the structure of the examples. Examples are tutorials that showcase Concrete on practical use cases, ranging from simple to elaborate. Each example is provided either as a Python script or as a Jupyter notebook.

Concrete Python

How to create an example?

Jupyter notebook example

  • Create examples/foo/foo.ipynb

  • Write the example in the notebook

  • The notebook will be executed in the CI with the make test-notebooks target

Python script example

  • Create examples/foo/foo.py

  • Write the example in the script

    • Example should contain a class called Foo

    • Foo should have the following arguments in its __init__:

      • configuration: Optional[fhe.Configuration] = None

      • compiled: bool = True

    • It should compile the circuit with an appropriate inputset using the given configuration if compiled is True

    • It should have any additional common utilities (e.g., encoding/decoding) shared between the tests and the benchmarks

  • Then, add tests for the implementation in tests/execution/test_examples.py
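The requirements above for a Python script example can be sketched as the following skeleton for examples/foo/foo.py. It is a minimal illustration, not a prescribed template: the circuit body, the inputset, and the `encode`/`decode` helpers are placeholders, and the `fhe.Compiler(...).compile(...)` call is assumed to match the installed Concrete version. Concrete is imported lazily so that the class can be constructed with compiled=False even where Concrete is not installed.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Only needed for type checking; the real import happens on compilation.
    from concrete import fhe


class Foo:
    """Skeleton of an example class (circuit and helpers are placeholders)."""

    def __init__(
        self,
        configuration: Optional["fhe.Configuration"] = None,
        compiled: bool = True,
    ):
        self.configuration = configuration
        self.circuit = None

        if compiled:
            from concrete import fhe

            # Compile with an inputset that covers the expected input range.
            compiler = fhe.Compiler(self.function, {"x": "encrypted"})
            inputset = range(16)
            self.circuit = compiler.compile(inputset, self.configuration)

    @staticmethod
    def function(x):
        """Placeholder computation to run homomorphically."""
        return x + 1

    # Utilities shared between the tests and the benchmarks.
    @staticmethod
    def encode(symbol: str) -> int:
        return int(symbol, 16)

    @staticmethod
    def decode(value: int) -> str:
        return format(value, "x")


# Construct without compiling, for example to reuse only the helpers.
example = Foo(compiled=False)
```

Keeping compilation optional lets tests and benchmarks reuse the encoding helpers, or compile later with their own configuration, without paying the compilation cost at construction time.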

Optionally, create benchmarks/foo.py and add benchmarks.