This document provides clear definitions of key concepts used in the Concrete framework.
Computation graph
A data structure to represent a computation. It takes the form of a directed acyclic graph where nodes represent inputs, constants, or operations.
Tracing
A method that takes a Python function provided by the user and generates a corresponding computation graph.
Bounds
The minimum and the maximum value that each node in the computation graph can take. Bounds are used to determine the appropriate data type (for example, uint3 or int5) for each node before the computation graphs are converted to MLIR. Concrete simulates the graph with the inputs in the inputset to record the minimum and the maximum value for each node.
Circuit
The result of compilation. A circuit includes both client and server components. It has methods for various operations, such as printing and evaluation.
Table Lookup (TLU)
A TLU is an instruction of the form y = T[i]. In FHE, this operation is performed with Programmable Bootstrapping, which is the equivalent operation on encrypted values. To learn more about TLUs, refer to the Table Lookup basic and the Table Lookup advanced sections.
Programmable Bootstrapping (PBS)
PBS is equivalent to a table lookup y = T[i] on encrypted values: the input i and the output y are encrypted, but the table T is not. You can find a more detailed explanation in the FHE Overview.
TFHE
TFHE is a Fully Homomorphic Encryption (FHE) scheme that allows you to perform computations over encrypted data. For in-depth explanation of the TFHE scheme, read our blog post series TFHE Deep Dive.
This document covers how to compute on encrypted data homomorphically using the Concrete framework. We will walk you through a complete example step-by-step.
The basic workflow of computation is as follows:
Define the function you want to compute
Compile the function into a Concrete Circuit
Use the Circuit to perform homomorphic evaluation
Here is the complete example, which we will explain step by step in the following paragraphs.
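Below is a minimal sketch assembled from the steps described in the following sections; the add function and the sample inputset are illustrative.

```python
from concrete import fhe

def add(x, y):
    return x + y

compiler = fhe.Compiler(add, {"x": "encrypted", "y": "encrypted"})

inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]

print("Compiling...")
circuit = compiler.compile(inputset)

print("Generating keys...")
circuit.keygen()

print("Homomorphic evaluation...")
encrypted_x, encrypted_y = circuit.encrypt(2, 6)
encrypted_result = circuit.run(encrypted_x, encrypted_y)
result = circuit.decrypt(encrypted_result)

assert result == add(2, 6)
```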
Another simple way to compile a function is to use a decorator.
This decorator is a way to add the compile
method to the function object without changing its name elsewhere.
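For instance, a sketch of the decorator approach, assuming the decorator is fhe.compiler:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def add(x, y):
    return x + y

inputset = [(2, 3), (0, 0), (7, 7), (4, 5)]
circuit = add.compile(inputset)

assert circuit.encrypt_run_decrypt(2, 6) == 8
```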
Import the fhe
module, which includes everything you need to perform homomorphic evaluation:
Here we define a simple addition function:
To compile the function, you first need to create a Compiler
by specifying the function to compile and the encryption status of its inputs:
For instance, to set the input y as clear:
An inputset is a collection representing the typical inputs of the function. It is used to determine the bit widths and shapes of the variables within the function.
The inputset should be an iterable that yields tuples of the same length as the number of arguments of the compiled function.
For example:
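One possible inputset for the addition function above could look like this:

```python
inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]
```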
Here, our inputset consists of 10 integer pairs, ranging from a minimum of (0, 0) to a maximum of (7, 7).
Choosing a representative inputset is critical to allow the compiler to find accurate bounds of all the intermediate values (see more details here). Evaluating the circuit with input values under or over the bounds may result in undefined behavior.
You can use the fhe.inputset(...)
function to easily create random inputsets, see more details in this documentation.
Use the compile
method of the Compiler
class with an inputset to perform the compilation and get the resulting circuit:
Use the keygen
method of the Circuit
class to generate the keys (public and private):
If you don't call the key generation explicitly, keys will be generated lazily when needed.
Now you can easily perform the homomorphic evaluation using the encrypt
, run
and decrypt
methods of the Circuit
:
In this document, we give a quick overview of the philosophy behind Concrete.
Concrete is a compiler that aims to turn Python code into its FHE equivalent, in a process called FHE compilation. Best efforts were made to simplify this process: in particular, with a few exceptions, the same functions that Python users are accustomed to are available. A more complete list of available functions is given in the reference section.
Basically, in the compiled circuit, there will be two kinds of operations:
Levelled operations, which are additions, subtractions, or multiplications by a constant; these operations are also called linear operations
Table Lookup (TLU) operations, which are used to do anything which is not linear.
TLU operations are essential to be able to compile complex functions. We explain their use in different sections of the documentation: direct TLU use or internal use to replace some non-linear functions. We have tools in Concrete to replace univariate or multivariate non-linear functions (i.e., functions of one or more inputs) with TLUs.
TLUs are more costly than levelled operations, so we also explain how to limit their impact.
Note that matrix multiplication (aka GEMM, General Matrix Multiplication) and convolution are levelled operations, since they involve only additions and multiplications by constants.
Functions can't use conditional branches or non-constant-size loops, unless modules are used. However, control flow statements with constant values are allowed, for example, for i in range(SOME_CONSTANT)
, if os.environ.get("SOME_FEATURE") == "ON":
.
In Concrete, everything needs to be an integer. Users needing floats can quantize to integers before encryption, operate on integers and dequantize to floats after decryption: all of this is done for the user in Concrete ML. However, you can have floating-point intermediate values as long as they can be converted to an integer Table Lookup, for example, (60 * np.sin(x)).astype(np.int64)
.
Functions can use scalars and tensors. As in Python, it is preferred to use tensorization to make computations faster.
Inputs of a compiled function can be either encrypted or clear, though the use of clear inputs is quite limited. Note that constants can appear in the program without many constraints; they are different from clear inputs, which are dynamic.
The bit width of encrypted values has a limit; exceeding it will trigger an error. We are constantly working on increasing this limit.
This document introduces the concept of Table Lookups (TLUs) in Concrete, covers the basic TLU usage, performance considerations, and some basic techniques for optimizing TLUs in encrypted computations. For more advanced TLU usage, refer to the Table Lookup advanced section
In TFHE, there are mainly two kinds of operations: linear operations, such as additions, subtractions, and multiplications by an integer, and non-linear operations. Non-linear operations are achieved with Table Lookups (TLUs).
When using TLUs in Concrete, the most crucial factor for speed is the bit-width of the TLU. The smaller the bit width, the faster the corresponding FHE operation. Therefore, you should reduce the size of inputs to the lookup tables whenever possible. At the end of this document, we discuss methods for truncating or rounding entries to decrease the effective input size, further improving TLU performance.
A direct TLU performs operations in the form of y = T[i]
, where T
is a table and i
is an index. You can define the table using fhe.LookupTable
and apply it to scalars or tensors.
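Here is a small sketch of a direct TLU; the table values are arbitrary:

```python
from concrete import fhe

table = fhe.LookupTable([2, -1, 3, 0])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

circuit = f.compile(range(4))
assert circuit.encrypt_run_decrypt(0) == 2
assert circuit.encrypt_run_decrypt(3) == 0
```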
The LookupTable
behaves like Python's array indexing, where negative indices access elements from the end of the table.
A multi TLU is used to apply different elements of the input to different tables (e.g., square the first column, cube the second column):
In many cases, you won't need to define your own TLUs, as Concrete will set them for you.
Note that this kind of TLU is compatible with the TLU options, particularly with rounding and truncating which are explained below.
fhe.univariate and fhe.multivariate extensions are convenient ways to perform more complex operations as transparent TLUs.
Reducing the bit size of TLU inputs is essential for execution efficiency, as mentioned in the previous performance section. One effective method is to replace the table lookup y = T[i]
by some y = T'[i']
, where i'
only has the most significant bits of i
and T'
is a much shorter table. This approach can significantly speed up the TLU while maintaining acceptable accuracy in many applications, such as machine learning.
In this section, we introduce two basic techniques: truncating or rounding. You can find more in-depth explanation and other advanced techniques of optimization in the TLU advanced documentation.
The first option is to set i'
as the truncation of i
. In this method, we just take the most significant bits of i
. This is done with fhe.truncate_bit_pattern
.
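A minimal sketch, assuming a 6-bit input from which the 2 least significant bits are dropped before a squaring TLU:

```python
from concrete import fhe

table = fhe.LookupTable([i ** 2 for i in range(2 ** 6)])

@fhe.compiler({"x": "encrypted"})
def f(x):
    # keep only the most significant bits of the 6-bit input
    truncated = fhe.truncate_bit_pattern(x, lsbs_to_remove=2)
    return table[truncated]

circuit = f.compile(range(2 ** 6))
```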
The second option is to set i
as the rounded value of i
. In this method, we take the most significant bits of i
and round up by 1 if the most significant ignored bit is 1. This is done with fhe.round_bit_pattern
.
However, this approach can be slightly more complex, as rounding might result in an index that exceeds the original table's bounds. To handle this, we expand the original table by one additional index:
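A minimal sketch, again with a 6-bit input; note the extra table entry for the case where rounding reaches one past the original maximum:

```python
from concrete import fhe

# one additional entry, since rounding up can produce 2**6
table = fhe.LookupTable([i ** 2 for i in range(2 ** 6 + 1)])

@fhe.compiler({"x": "encrypted"})
def f(x):
    rounded = fhe.round_bit_pattern(x, lsbs_to_remove=2)
    return table[rounded]

circuit = f.compile(range(2 ** 6))
```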
For further optimizations, the fhe.round_bit_pattern
function has an exactness=fhe.Exactness.APPROXIMATE
option, which allows for faster computations at the cost of minor differences between cleartext and encrypted results:
Concrete is an open-source FHE Compiler that simplifies the use of Fully Homomorphic Encryption (FHE).
Learn the basics of Concrete, set it up, and make it run with ease.
Start building with Concrete by exploring its core features, discovering essential guides, and learning more with step-by-step tutorials.
Access to additional resources and join the Zama community.
Refer to the API, review product architecture, and access additional resources for in-depth explanations while working with Concrete.
Ask technical questions and discuss with the community. Our team of experts usually answers within 24 hours in working days.
Collaborate with us to advance the FHE space and drive innovation together.
Concrete is an open source framework that simplifies the use of Fully Homomorphic Encryption (FHE).
FHE is a powerful technology that enables computations on encrypted data without needing to decrypt it. This capability ensures user privacy and provides robust protection against data breaches, as operations are performed on encrypted data, keeping sensitive information secure even if the server is compromised.
The Concrete framework makes writing FHE programs easy for developers by incorporating a Fully Homomorphic Encryption over the Torus (TFHE) Compiler based on LLVM.
Concrete enables developers to efficiently develop privacy-preserving applications for various use cases. For instance, Concrete ML is built on top of Concrete to integrate privacy-preserving features of FHE into machine learning use cases.
This document explains the steps to install Concrete into your project.
Concrete is natively supported on Linux and macOS, from Python 3.8 to 3.12 inclusive. If you have Docker on your platform, you can use the Docker image to run Concrete.
Install Concrete from PyPI using the following commands:
Not all versions are available on PyPI. If you need a version that is not on PyPI (including nightly releases), you can install it from our package index by adding --extra-index-url https://pypi.zama.ai/cpu/
. GPU wheels are also available under https://pypi.zama.ai/gpu/
(check https://pypi.zama.ai/
for all available platforms).
To enable all the optional features, install the full
version of Concrete:
Not all versions are available on PyPI. If you need a version that is not on PyPI (including nightly releases), you can install it from our package index by adding --extra-index-url https://pypi.zama.ai/cpu.
In particular, wheels with GPU support are not on PyPI. You can install them from our package index by adding --extra-index-url https://pypi.zama.ai/gpu; more information on GPU wheels is available here.
The full version requires pygraphviz, which depends on graphviz. Make sure to install all the dependencies on your operating system before installing concrete-python[full]
.
Installing pygraphviz
on macOS can be problematic (see more details here).
If you're using homebrew, you can try the following way:
before running:
You can also get the Concrete docker image. Replace v2.4.0
below by the version you want to install:
Docker is not supported on Apple Silicon.
This document explains how to combine compiled functions in Concrete, focusing on scenarios where multiple functions need to work together seamlessly. The goal is to ensure that outputs from certain functions can be used as inputs for others without decryption, including in recursive functions.
Concrete offers two methods to achieve this:
Using the composable
flag: This method is suitable when there is a single function. The composable flag allows the function to be compiled in a way that its output can be used as input for subsequent operations. For more details, refer to the composition documentation.
Using Concrete modules: This method is ideal when dealing with multiple functions or when more control is needed over how outputs are reused as inputs. Concrete modules allow you to specify precisely how functions interact. For further information, see the modules documentation.
This document provides an overview of the bit extraction feature in Concrete, including usage examples, limitations, and performance considerations.
Bit extraction can be useful in applications that require directly manipulating the bits of integers. Bit extraction allows you to extract a specific slice of bits from an integer, where index 0 corresponds to the least significant bit (LSB). The cost of this operation increases with the index of the highest significant bit you wish to extract.
Bit extraction only works in the Native
encoding, which is usually selected when all table lookups in the circuit are less than or equal to 8 bits.
You can use slices for indexing fhe.bits(value)
:
Bit extraction supports slices with negative steps:
Bit extraction supports signed integers:
Here's a practical example that uses bit extraction to determine if a number is even:
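A minimal sketch of such a check; the inputs are illustrative:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def is_even(x):
    # the least significant bit is 0 for even numbers
    return 1 - fhe.bits(x)[0]

circuit = is_even.compile(range(32))
print(circuit.encrypt_run_decrypt(4))  # 1
print(circuit.encrypt_run_decrypt(7))  # 0
```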
For the sketch above, it prints 1 for the even input and 0 for the odd one.
Negative indexing is not supported: Bits extraction using negative indices is not supported, such as fhe.bits(x)[-1]
.
This is because the bit-width of x
is unknown before inputset evaluation, making it impossible to determine the correct bit to extract.
Reverse slicing requires explicit starting bit: When extracting bits in reverse order (using a negative step), the start bit must be specified, for example, fhe.bits(x)[::-1]
is not supported.
Signed integer slicing requires explicit stopping bit: For signed integers, when using slices, the stop bit must be explicitly provided, for example, fhe.bits(x)[1:]
is not supported.
Float bit extraction is not supported: While Concrete supports floats to some extent, bit extraction is not possible on float types.
Extracting a specific bit requires clearing all the preceding lower bits. This involves extracting these previous bits as intermediate values and then subtracting them from the input.
Implications:
Bits are extracted sequentially, from the least significant bit to the most significant ones. The cost is proportional to the index of the highest extracted bit plus one.
No parallelization is possible. The computation time is proportional to the cost, independent of the number of CPUs.
Examples:
Extracting fhe.bits(x)[4]
is approximately five times costlier than extracting fhe.bits(x)[0]
.
Extracting fhe.bits(x)[4]
takes around five times more wall clock time than fhe.bits(x)[0]
.
The cost of extracting fhe.bits(x)[0:5]
is almost the same as that of fhe.bits(x)[5]
.
Common sub-expression elimination is applied to intermediate extracted bits.
Implications:
The overall cost for a series of fhe.bits(x)[m:n]
calls on the same input x
is almost equivalent to the cost of the single most computationally expensive extraction in the series, i.e. fhe.bits(x)[n]
.
The order of extraction in that series does not affect the overall cost.
Example:
The combined operation fhe.bits(x)[3] + fhe.bits(x)[2] + fhe.bits(x)[1]
has almost the same cost as fhe.bits(x)[3]
.
Each extracted bit incurs a cost of approximately one TLU of 1-bit input precision. Therefore, fhe.bits(x)[0]
is generally faster than any other TLU operation.
This document introduces several common techniques for optimizing code to fit Fully Homomorphic Encryption (FHE) constraints. The examples provided demonstrate various workarounds and performance optimizations that you can implement while working with the Concrete library.
All code snippets provided here are temporary workarounds. In future versions of Concrete, some functions described here could be directly available in a more generic and efficient form. These code snippets come from support answers in our community forum.
This example demonstrates how to retrieve a value from an array using an encrypted index. The method creates a "selection" array filled with 0s except at the requested index, which will be 1. It then sums the products of all array values with this selection array:
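A minimal sketch of this technique, with a fixed array length of 4; the helper name value_at_index is illustrative:

```python
import numpy as np
from concrete import fhe

@fhe.compiler({"array": "encrypted", "index": "encrypted"})
def value_at_index(array, index):
    # selection is 1 at the requested position and 0 elsewhere
    selection = (np.arange(4) == index)
    return np.sum(array * selection)

inputset = [
    (np.random.randint(0, 16, size=4), np.random.randint(0, 4))
    for _ in range(20)
]
circuit = value_at_index.compile(inputset)
```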
This example filters an encrypted array with an encrypted condition, in this case a greater than
comparison with an encrypted value. It packs all values with a selection bit that results from the comparison, allowing the unpacking of only the filtered values:
This example introduces a key concept when using Concrete: maximizing parallelization. Instead of sequentially summing all values to compute a mean, the values are split into sub-groups, and the mean of these sub-group means is computed:
This document introduces the usages and optimization strategies of non-linear operations in Concrete, focusing on comparisons, min/max operations, bitwise operations, and shifts. For a more in-depth explanation on advanced options, refer to the Table Lookup advanced documentation.
In Concrete, there are two types of operations:
Linear operations: These include additions, subtractions, and multiplications by an integer. They are computationally fast.
Non-linear operations: These require Table Lookups (TLUs) to maintain the semantic integrity of the user's program. TLUs are slower, and their performance varies depending on the bit width of the inputs.
Binary operations often require operands to have matching bit widths. This adjustment can be achieved in two ways: either directly within the MLIR or dynamically at execution time using a TLU. Each method has its own advantages and trade-offs, so Concrete provides multiple configuration options for non-linear functions.
MLIR adjustment: This method doesn't require an expensive TLU. However, it may affect other parts of your program if the adjusted operand is used elsewhere, potentially causing more changes.
Dynamic adjustment with TLU: This method is more localized and won’t impact other parts of your program, but it’s more expensive due to the cost of using a TLU.
In the following non-linear operations, we propose a certain number of configurations, using the two methods on the different operands. It’s not always clear which option will be the fastest, so we recommend trying out different configurations to see what works best for your circuit.
Note that you have the option to set show_mlir=True
to view how the MLIR handles TLUs and bit width changes. However, it's not essential to understand these details, so we recommend simply testing the configurations and picking the one that performs best for your case.
For comparison, there are 7 available methods. Here's the general principle:
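The general pattern is a sketch like the following, where the strategy is passed through the configuration; comparison_strategy_preference is assumed to be the option name:

```python
from concrete import fhe

config = fhe.ComparisonStrategy.CHUNKED  # any of the strategies listed below

configuration = fhe.Configuration(comparison_strategy_preference=config)

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    # operands with different bit widths, which is where the strategy matters
    return x < y

circuit = f.compile([(3, 10), (7, 2), (0, 15)], configuration)
```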
The config
can be one of the following:
fhe.ComparisonStrategy.CHUNKED
fhe.ComparisonStrategy.ONE_TLU_PROMOTED
fhe.ComparisonStrategy.THREE_TLU_CASTED
fhe.ComparisonStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED
fhe.ComparisonStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED
fhe.ComparisonStrategy.THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED
fhe.ComparisonStrategy.TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED
For min / max operations, there are 3 available methods. Here's the general principle:
The config
can be one of the following:
fhe.MinMaxStrategy.CHUNKED
(default)
fhe.MinMaxStrategy.ONE_TLU_PROMOTED
fhe.MinMaxStrategy.THREE_TLU_CASTED
For bit wise operations (typically, AND, OR, XOR), there are 5 available methods. Here's the general principle:
The config
can be one of the following:
fhe.BitwiseStrategy.CHUNKED
fhe.BitwiseStrategy.ONE_TLU_PROMOTED
fhe.BitwiseStrategy.THREE_TLU_CASTED
fhe.BitwiseStrategy.TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED
fhe.BitwiseStrategy.TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED
For shift operations, there are 2 available methods. Here's the general principle:
The shifts_with_promotion
is either True
or False
.
fhe.multivariate
All binary operations described in this document can also be implemented with the fhe.multivariate
function which is described in fhe.multivariate function documentation. Here's an example:
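For instance, a sketch of a binary operation implemented as a single multivariate TLU; the packed function is illustrative:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    # the whole (x, y) -> x & y mapping becomes one table lookup
    return fhe.multivariate(lambda x, y: x & y)(x, y)

circuit = f.compile([(3, 5), (7, 7), (0, 1)])
```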
This document explains the multi-precision option for bit-width assignment for integers.
The multi-precision option enables the frontend to use the smallest bit-width possible for each operation in Fully Homomorphic Encryption (FHE), improving computation efficiency.
Each integer in the circuit has a certain bit-width, which is determined by the input-set. These bit-widths are visible when graphs are printed, for example:
However, adding integers with different bit-widths (for example, 3-bit and 4-bit numbers) directly isn't possible due to differences in encoding, as shown below:
When you add a 3-bit number and a 4-bit number, the result is a 5-bit number with a different encoding:
To address these encoding differences, a graph processing step called bit-width assignment is performed. This step updates the graph's bit-widths to ensure compatibility with Fully Homomorphic Encryption (FHE).
After this step, the graph might look like this:
Most operations cannot change the encoding, requiring the input and output bit-widths to remain the same. However, the table lookup operation can change the encoding. For example, consider the following graph:
This graph represents the computation (x**2) + y
where x
is 2-bits and y
is 5-bits. Without the ability to change encodings, all bit-widths would need to be adjusted to 6-bits. However, since the encoding can change, bit-widths are assigned more efficiently:
In this case, x
remains a 2-bit integer, but the Table Lookup result and y
are set to 6-bits to allow for the addition.
This approach to bit-width assignment is known as multi-precision and is enabled by default. To disable multi-precision and enforce a single precision across the circuit, use the single_precision=True
configuration option.
This document explains how to combine compiled functions with the composable
flag in Concrete.
By setting the composable
flag to True
, you can compile a function such that its outputs can be reused as inputs. For example, you can then easily compute f(f(x))
or f**i(x) = f(f(...(f(x) ..))
for a non-encrypted integer i
variable, which is usually required for recursions.
Here is an example:
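A minimal sketch; the increment function and modulus are illustrative:

```python
from concrete import fhe

@fhe.compiler({"counter": "encrypted"})
def increment(counter):
    return (counter + 1) % 100

circuit = increment.compile(range(100), composable=True)
circuit.keygen()

# feed the encrypted output back as input, without decrypting in between
state = circuit.encrypt(0)
for _ in range(5):
    state = circuit.run(state)
print(circuit.decrypt(state))  # 5
```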
Note that this option is equivalent to using the fhe.AllComposable policy of modules. In particular, the same limitations may occur (see the limitations documentation section).
This document explains the implications and configuration of multi parameters in Concrete.
In Concrete, integers are encrypted and processed based on a set of cryptographic parameters. By default, the Concrete optimizer selects multiple sets of these parameters, which may not be optimal for every use case. In such cases, you can choose to use mono parameters instead.
When multi parameters are enabled, the optimizer selects a different set of parameters for each bit-width in the circuit. This approach has several implications:
Faster execution in general
Slower key generation
Larger keys
Larger memory usage during execution
When enabled, you can control the level of circuit partitioning by setting the multi_parameter_strategy as described in configuration.
To disable multi parameters, use parameter_selection_strategy=fhe.ParameterSelectionStrategy.MONO
configuration option.
This document explains the compression feature in Concrete and its performance impact.
Fully Homomorphic Encryption (FHE) needs both ciphertexts (encrypted data) and evaluation keys to carry out the homomorphic evaluation of a function. Both elements are large, which may critically affect the application's performance depending on the use case, application deployment, and the method for transmitting and storing ciphertexts and evaluation keys.
During compilation, you can enable compression options to enforce the use of compression features. The two available compression options are:
compress_evaluation_keys: bool = False,
This specifies that serialization takes the compressed form of evaluation keys.
compress_input_ciphertexts: bool = False,
This specifies that serialization takes the compressed form of input ciphertexts.
You can see the impact of compression by comparing the size of the serialized form of input ciphertexts and evaluation keys with a sample code:
The compression factor largely depends on the cryptographic parameters identified and the compression algorithms selected during the compilation.
Currently, Concrete uses seeded compression algorithms. These algorithms rely on the fact that CSPRNGs are deterministic. Consequently, the chain of random values can be replaced by the seed and later recalculated using the same seed.
Typically, the size of a ciphertext is (lwe dimension + 1) * 8
bytes, while the size of a seeded ciphertext is constant, equal to 3 * 8
bytes. Thus, the compression factor ranges from a hundred to thousands. Understanding the compression factor of evaluation keys is complex. The compression factor of evaluation keys typically ranges between 0 and 10.
Please note that while compression may save bandwidth and disk space, it incurs the cost of decompression. Currently, decompression occurs more or less lazily during FHE evaluation, without any user control.
This document explains how to reuse encrypted arguments in applications where the same arguments are used repeatedly.
Encrypting data can be resource-intensive, especially when the same argument or set of arguments is used multiple times. In such cases, it’s inefficient to encrypt and transfer the arguments repeatedly. Instead, you can encrypt the arguments separately and reuse them as needed. By encrypting the arguments once and reusing them, you can optimize performance by reducing encryption time, memory usage, and network bandwidth.
Here is an example:
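A minimal sketch of encrypting arguments separately and reusing one of them:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def add(x, y):
    return x + y

circuit = add.compile([(7, 7), (0, 0), (3, 4), (5, 2)])
circuit.keygen()

# encrypt both arguments once
encrypted_x, encrypted_y = circuit.encrypt(3, 4)
print(circuit.decrypt(circuit.run(encrypted_x, encrypted_y)))  # 7

# reuse encrypted_x with a freshly encrypted second argument
_, encrypted_z = circuit.encrypt(None, 5)
print(circuit.decrypt(circuit.run(encrypted_x, encrypted_z)))  # 8
```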
Note when you use encrypt
method:
If you have multiple arguments, the encrypt
method would return a tuple
.
If you specify None
as one of the arguments, None
is placed at the same location in the resulting tuple
.
For example, circuit.encrypt(a, None, b, c, None)
returns (encrypted_a, None, encrypted_b, encrypted_c, None)
.
Each value returned by encrypt
can be stored and reused anytime.
The order of arguments must be consistent when encrypting and using them. Encrypting an x
and using it as a y
could result in undefined behavior.
This document introduces some extensions of Concrete, including functions for wrapping univariate and multivariate functions, performing convolution and maxpool operations, creating encrypted arrays, and more.
Wraps any univariate function into a single table lookup:
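For example, a sketch wrapping an arbitrary integer-to-integer function; the function body is illustrative:

```python
import numpy as np
from concrete import fhe

def complex_univariate_function(x):
    # any deterministic, element-wise, integer-to-integer mapping
    return np.abs(50 * np.sin(x)).astype(np.int64)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.univariate(complex_univariate_function)(x)

circuit = f.compile(range(8))
```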
The wrapped function must follow these criteria:
No side effects: For example, no modification of global state
Deterministic: For example, no random number generation.
Shape consistency: output.shape
should be the same with input.shape
Element-wise mapping: Each output element must correspond to a single input element; for example, output[0]
should only depend on input[0]
of all inputs.
Violating these constraints may result in undefined outcomes.
Wraps any multivariate function into a table lookup:
The wrapped functions must follow these criteria:
No side effects: For example, avoid modifying global state.
Deterministic: For example, no random number generation.
Broadcastable shapes: input.shape
should be broadcastable to output.shape
for all inputs.
Element-wise mapping: Each output element must correspond to a single input element, for example, output[0]
should only depend on input[0]
of all inputs.
Violating these constraints may result in undefined outcomes.
Multivariate functions cannot be called with rounded inputs.
Perform a convolution operation, with the same semantic as onnx.Conv:
Only 2D convolutions without padding and with one group are currently supported.
Perform a maxpool operation, with the same semantic as onnx.MaxPool:
Only 2D maxpooling without padding and up to 15-bits is currently supported.
Create encrypted arrays:
Currently, only scalars can be used to create arrays.
Create an encrypted scalar zero:
Create an encrypted tensor of zeros:
Create an encrypted scalar one:
Create an encrypted tensor of ones:
Allows you to create an encrypted constant of a given value.
This extension is also compatible with constant arrays.
Hint properties of a value. Imagine you have this circuit:
You'd expect all of a, b, and c to be 8-bits, but because the inputset is very small, this code could print:
The first solution in these cases should be to use a bigger inputset, but it can still be tricky to solve with the inputset. That's where the hint
extension comes into play. Hints are a way to provide extra information to compilation process:
Bit-width hints are for constraining the minimum number of bits in the encoded value. If you hint a value to be 8-bits, it means it should be at least uint8
or int8
.
To fix f
using hints, you can do:
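A sketch, assuming fhe.hint accepts a bit_width keyword; the intermediates a and b are illustrative:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    # hint the intermediates so they are encoded on at least 8 bits,
    # even if the inputset never reaches the full range
    a = fhe.hint(x ** 2, bit_width=8)
    b = fhe.hint(y ** 2, bit_width=8)
    return a + b

circuit = f.compile([(3, 3), (0, 0), (7, 7)])
```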
Hints are only applied to the value being hinted, and no other value. If you want the hint to be applied to multiple values, you need to hint all of them.
you'll always see:
regardless of the bounds.
Alternatively, you can use it to make sure a value can store certain integers:
Perform ReLU operation, with the same semantic as x if x >= 0 else 0
:
The ReLU operation can be implemented in two ways:
Single TLU (Table Lookup Unit) on the original bit-width: Suitable for small bit-widths, as it requires fewer resources.
Multiple TLUs on smaller bit-widths: Better for large bit-widths, avoiding the high cost of a single large TLU.
The method of conversion is controlled by the relu_on_bits_threshold: int = 7
option. For example, setting relu_on_bits_threshold=5
means:
Bit-widths from 1 to 4 will use a single TLU.
Bit-widths of 5 and above will use multiple TLUs.
Another option to fine-tune the implementation is relu_on_bits_chunk_size: int = 2
. For example, setting relu_on_bits_chunk_size=4
means that when using the second implementation (using chunks), the input is split into 4-bit chunks using fhe.bits, and the ReLU is then applied to those chunks, which are combined back together.
Here is a script showing how execution cost is impacted when changing these values:
You might need to run the script twice to avoid crashing when plotting.
The script will show the following figure:
The default values of these options are set based on simple circuits. How they affect performance will depend on the circuit, so play around with them to get the most out of this extension.
Conversion with the second method (using chunks) only works in Native
encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.
Perform ternary if operation, with the same semantic as x if condition else y
:
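A minimal sketch:

```python
from concrete import fhe

@fhe.compiler({"condition": "encrypted", "x": "encrypted", "y": "encrypted"})
def f(condition, x, y):
    return fhe.if_then_else(condition, x, y)

circuit = f.compile([(1, 3, 5), (0, 3, 5), (1, 7, 0)])
assert circuit.encrypt_run_decrypt(1, 3, 5) == 3
assert circuit.encrypt_run_decrypt(0, 3, 5) == 5
```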
fhe.if_then_else
is just an alias for np.where.
Copy the value:
The fhe.identity
extension is useful for cloning an input with a different bit-width.
Identity extension only works in Native
encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.
It is similar to fhe.identity
but with the extra guarantee that encryption noise is refreshed.
Refresh is useful when you want to control precisely where encryption noise is refreshed in your circuit. For instance, if you are using modules, compilation sometimes rejects the module because it's not composable. This happens because a function of the module never refreshes the encryption noise. Adding a return fhe.refresh(result)
on the function result solves the issue.
Refresh extension only works in Native
encoding, which is usually selected when all table lookups in the circuit are below or equal to 8 bits.
Create a random inputset with the given specifications:
The result will have 100 inputs by default which can be customized using the size keyword argument:
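A sketch, assuming the specifications can be given as fhe.uintN value descriptions:

```python
from concrete import fhe

# 100 random (uint4, uint6) samples by default
inputset = fhe.inputset(fhe.uint4, fhe.uint6)

# or a custom number of samples
small_inputset = fhe.inputset(fhe.uint4, fhe.uint6, size=10)
```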
This document explains how to compile Fully Homomorphic Encryption (FHE) modules containing multiple functions using Concrete.
Deploying a server that contains many compatible functions is important for some use cases. With Concrete, you can compile FHE modules containing as many functions as needed.
These modules support the composition of different functions, meaning that the encrypted result of one function can be used as the input for another function without needing to decrypt it first. Additionally, a module is deployed as a single artifact, making it as simple to use as a single-function project.
The following example demonstrates how to create an FHE module:
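A minimal sketch of a two-function module; the Counter name and the modulus are illustrative:

```python
from concrete import fhe

@fhe.module()
class Counter:
    @fhe.function({"x": "encrypted"})
    def inc(x):
        return (x + 1) % 20

    @fhe.function({"x": "encrypted"})
    def dec(x):
        return (x - 1) % 20
```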
Then, to compile the Counter
module, use the compile
method with a dictionary of input-sets for each function:
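For the sketch above, this could look like:

```python
inputset = list(range(20))
counter_fhe = Counter.compile({"inc": inputset, "dec": inputset})
```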
After the module is compiled, you can encrypt and call the different functions as follows:
You can generate the keyset beforehand by calling the keygen()
method on the compiled module:
Composition is not limited to single input / single output. Here is an example that computes the 10 first elements of the Fibonacci sequence in FHE:
Executing this script will provide the following output:
Modules support iteration with cleartext iterands to some extent, particularly for loops structured like this:
This script prints the following output:
In this example, a while loop iterates until the decrypted value equals 1. The loop body is implemented in FHE, but the iteration control must be in cleartext.
By default, when using modules, all inputs and outputs of every function are compatible, sharing the same precision and crypto-parameters. This approach applies the crypto-parameters of the most costly code path to all code paths. This simplicity may be costly and unnecessary for some use cases.
To optimize runtime, we provide finer-grained control over the composition policy via the composition
module attribute. Here is an example:
You have 3 options for the composition
attribute:
fhe.AllComposable
(default): This policy ensures that all ciphertexts used in the module are compatible. It is the least restrictive policy but the most costly in terms of performance.
fhe.NotComposable
: This policy is the most restrictive but the least costly. It is suitable when you do not need any composition and only want to pack multiple functions in a single artifact.
fhe.Wired
: This policy allows you to define custom composition rules. You can specify which outputs of a function can be forwarded to which inputs of another function.
Note that, in case of complex composition logic, another option is to rely on automatic module tracing (described below) to automatically derive the composition from examples.
In this case, the policy states that the first output of the collatz
function can be forwarded to the first input of collatz
, but not the second output (which is decrypted every time, and used for control flow).
You can use the fhe.Wire
between any two functions. It is also possible to define wires with fhe.AllInputs
and fhe.AllOutputs
ends. For instance, in the previous example:
This policy would be equivalent to using the fhe.AllComposable
policy.
When a module's composition logic is static and straightforward, declaratively defining a Wired
policy is usually the simplest approach. However, in cases where modules have more complex or dynamic composition logic, deriving an accurate list of Wire
components to be used in the policy can become challenging.
Another related problem is defining different function input-sets. When the composition logic is simple, these can be provided manually. But as the composition gets more convoluted, computing a consistent ensemble of inputsets for a module may become intractable.
For those advanced cases, you can derive the composition rules and the input-sets automatically from user-provided examples. Consider the following module:
You can use the wire_pipeline
context manager to activate the module tracing functionality:
Note that any dynamic branching is possible during module tracing. However, for complex runtime logic, ensure that the input set provides sufficient examples to cover all potential code paths.
Depending on the functions, composition may add a significant overhead compared to a non-composable version.
To be composable, a function must meet the following condition: every output that can be forwarded as input (according to the composition policy) must contain a noise-refreshing operation. Since adding a noise refresh has a noticeable impact on performance, Concrete does not automatically include it.
For instance, to implement a function that doubles an encrypted value, you might write:
This function is valid with the fhe.NotComposable
policy. However, if compiled with the fhe.AllComposable
policy, it will raise a RuntimeError: Program cannot be composed: ...
, indicating that an extra Programmable Bootstrapping (PBS) step must be added.
To resolve this and make the circuit valid, add a PBS at the end of the circuit:
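A sketch of the fix, using fhe.refresh to insert the extra PBS on the doubling function described above:

```python
from concrete import fhe

@fhe.module()
class Doubler:
    @fhe.function({"x": "encrypted"})
    def double(x):
        # fhe.refresh adds a PBS on the result, so the output noise
        # no longer depends directly on the input noise
        return fhe.refresh(x * 2)
```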
This document explains how to use simulation to speed up the development, enabling faster prototyping while accounting for the inherent probability of errors in Fully Homomorphic Encryption (FHE) execution.
During development, the speed of homomorphic execution can be a blocker for fast prototyping. Although you can directly call the function you want to compile, this approach does not fully replicate FHE execution, which involves a certain probability of error (see ).
To overcome this issue, simulation is introduced:
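A minimal sketch, assuming the fhe_simulation configuration option and the circuit.simulate method:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def add(x, y):
    return x + y

circuit = add.compile([(2, 3), (0, 0), (7, 7)], fhe_simulation=True)

# runs on cleartext values while modelling the FHE error probability
print(circuit.simulate(2, 6))  # 8 (with overwhelming probability)
```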
After the simulation runs, it prints the simulated results, which match the cleartext computation (up to the configured probability of error).
Overflow can happen during an FHE computation, leading to unexpected behaviors. Using simulation can help you detect these events by printing a warning whenever an overflow happens. This feature is disabled by default, but you can enable it by setting detect_overflow_in_simulation=True
during compilation.
To demonstrate, we will compile the previous circuit with overflow detection enabled and trigger an overflow:
You will see the following warning after the simulation call:
If you look at the MLIR (circuit.mlir
), you will see that the input type is supposed to be eint4
represented in 4 bits with a maximum value of 15. Since there's an addition of the input, we used the maximum value (15) here to trigger an overflow (15 + 1 = 16 which needs 5 bits). The warning specifies the operation that caused the overflow and its location. Similar warnings will be displayed for all basic FHE operations such as add, mul, and lookup tables.
This document explains how to use restrictions to limit the possible crypto-parameters used for the keys.
When compiling a module, the optimizer analyzes the circuits and the expected probability of error, to identify the fastest crypto-parameters that meet the specific constraints. The chosen crypto-parameters determine the size of the keys and the ciphertexts. This means that if an existing module is used in production with a specific set of crypto-parameters, there is no guarantee that a compilation of a second, different module will yield compatible crypto-parameters.
With restrictions, Concrete provides a way to ensure that a compilation generates compatible crypto-parameters. Restrictions will limit the search-space walked by the optimizer to ensure that only compatible parameters can be returned. As of now, we support two major restrictions:
Keyset restriction: restricts the crypto-parameters to an existing keyset.
Ranges restriction: restricts the crypto-parameter ranges allowed in the optimizer.
You can generate a keyset restriction directly from an existing keyset:
You can build a ranges restriction by adding available values:
Note that if no available parameters are set for one of the parameter ranges (say ks_base_log
), it is assumed that the default range is available.
This document explains the most common errors and provides solutions to fix them.
Error message: Could not find a version that satisfies the requirement concrete-python (from versions: none)
Cause: The installation is not working correctly for you.
Possible solutions:
Be sure that you use a supported Python version (currently 3.8 to 3.12, inclusive).
Check that you have done pip install -U pip wheel setuptools
before.
Consider adding a --extra-index-url https://pypi.zama.ai/cpu
or --extra-index-url https://pypi.zama.ai/gpu
, depending on whether you want the CPU or the GPU wheel.
Concrete requires glibc>=2.28, be sure to have a sufficiently recent version.
Error message: RuntimeError: Function you are trying to compile cannot be compiled
with extra information only integers are supported
Cause: Parts of your program contain graphs that are not from integer to integer
Possible solutions:
You can use floats as intermediate values (see the ). However, both inputs and outputs must be integers. Consider converting values to integers, such as .astype(np.uint64)
Error message: NoParametersFound
Cause: The optimizer can't find cryptographic parameters for the circuit that are both secure and correct.
Possible solutions:
Try to simplify your circuit.
Use smaller weights.
Add an intermediate PBS to reduce the noise, for example with fhe.refresh(x).
Error message: RuntimeError: Function you are trying to compile cannot be compiled
, with extra information this [...]-bit value is used as an input to a table lookup but only up to 16-bit table lookups are supported
Cause: The program uses a Table Lookup that contains oversized inputs exceeding the current 16-bit limit.
Possible solutions:
Try to simplify your circuit.
Use smaller weights.
Look to the graph to understand where this oversized input comes from and ensure that the input size for Table Lookup operations does not exceed 16 bits.
Use show_bit_width_constraints=True
to understand why bit widths are assigned the way they are.
Error message: RuntimeError: A subgraph within the function you are trying to compile cannot be fused because it has multiple input nodes
Cause: A subgraph in your program uses two or more input nodes. Such a graph cannot be fused, that is, replaced by a single table lookup. Concrete indicates the offending input nodes by printing this is one of the input nodes in the circuit.
Possible solutions:
Try to simplify your circuit.
Have a look at fhe.multivariate
.
Error message: RuntimeError: Function '[...]' is not supported
Cause: The function used is not currently supported by Concrete.
Possible solutions:
Try to change your program.
Check the corresponding documentation to see if there are ways to implement the function differently.
Error message: RuntimeError: Branching within circuits is not possible
Cause: Branching operations, such as if statements or non-constant loops, are not supported in Concrete's FHE programs.
Possible solutions:
Change your program.
Consider using tricks to replace the ternary-if, such as c ? t : f = f + c * (t-f)
.
Error message: Unfeasible noise constraint encountered
Cause: The optimizer can't find cryptographic parameters for the circuit that are both secure and correct.
Possible solutions:
Try to simplify your circuit.
Use smaller weights.
Add intermediate PBS to reduce the noise, with identity function fhe.refresh(x)
.
Error message: Program can not be composed
Cause: Some circuit outputs are contaminated by unrefreshed input noise.
Possible solutions:
Add intermediate PBS to refresh the noise with fhe.refresh(x)
.
Unbounded loops or complex dynamic conditions are also supported, as long as these conditions are computed in pure cleartext in Python. The following example computes the :
Post your issue in our community forum.
This document shows some basic things you can do to improve the performance of your circuit.
Here are some quick tips to reduce the execution time of your circuit:
Reduce the amount of table lookups in the circuit.
Try different implementation strategies for complex operations.
Utilize rounding and truncating if your application doesn't require precise execution.
Use tensors as much as possible in your circuits.
Enable dataflow parallelization, by setting dataflow_parallelize=True
in the configuration.
Tweak p_error
configuration option until you get optimal exactness vs performance tradeoff for your application.
Specify composition when using modules.
You can refer to our full Optimization Guide for detailed examples of how to do each of these, and more!
This document explains how to manage keys when using Concrete, introducing the key management API for generating, reusing, and securely handling keys.
Concrete generates keys lazily when needed. While this is convenient for development, it's not ideal for the production environment. The explicit key management API is available for you to easily generate and reuse keys as needed.
Let's start by defining a circuit with the following example:
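For instance, a small squaring circuit (illustrative):

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x ** 2

circuit = f.compile(range(16))
```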
Circuits have a keys
property of type fhe.Keys
, which includes several utilities for key management.
To explicitly generate keys for a circuit, use:
Generated keys are stored in memory and remain unencrypted.
You can also set a custom seed for reproducibility:
Do not specify the seed manually in a production environment! This is not secure and should only be done for debugging purposes.
To serialize keys, for tasks such as sending them across a network, use:
Keys are not serialized in encrypted form. Please make sure you keep them in a safe environment, or encrypt them manually after serialization.
To deserialize the keys back after receiving serialized keys, use:
Once you have a valid fhe.Keys
object, you can directly assign it to the circuit:
If assigned keys are generated for a different circuit, an exception will be raised.
You can also use the filesystem to store the keys directly, without managing serialization and file management manually:
Keys are not saved in encrypted form. Please make sure you store them in a safe environment, or encrypt them manually after saving.
After saving keys to disk, you can load them back using:
If you want to generate keys in the first run and reuse the keys in consecutive runs, use:
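A sketch, assuming the keys API exposes load_if_exists_generate_and_save_otherwise; the key location is hypothetical:

```python
KEYS_PATH = "/path/to/keys"  # hypothetical location

# load keys from disk if present, otherwise generate them and save them there
circuit.keys.load_if_exists_generate_and_save_otherwise(KEYS_PATH)
```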
This document provides guidance on debugging the compilation process.
Two configuration options are available to help you understand the compilation process:
compiler_verbose_mode: Prints the compiler passes and shows the transformations applied. It can help identify the crash location if a crash occurs.
compiler_debug_mode: A more detailed version of the verbose mode, providing additional information, particularly useful for diagnosing crashes.
These flags might not work as expected in Jupyter notebooks as they output to stderr
directly from C++.
Concrete includes an artifact system that simplifies the debugging process by automatically or manually exporting detailed information during compilation failures.
When a compilation fails, artifacts are automatically exported to the .artifacts
directory in the working directory. Here's an example of what gets exported when a function fails to compile:
This function fails to compile because Concrete does not support floating-point outputs. When you try to compile it, an exception will be raised and the artifacts will be exported automatically. The following files will be generated in the .artifacts
directory:
environment.txt
: Information about your system setup, including the operating system and Python version.
requirements.txt
: The installed Python packages and their versions.
function.txt
: The code of the function that failed to compile.
parameters.txt
: Information about the encryption status of the function's parameters.
1.initial.graph.txt
: The textual representation of the initial computation graph right after tracing.
final.graph.txt
: The textual representation of the final computation graph right before MLIR conversion.
traceback.txt
: Details of the error that occurred.
Manual exports are mostly used for visualization and demonstrations. Here is how to perform one:
After running the code, you will find the following files under /tmp/custom/export/path
directory:
1.initial.graph.txt
: The textual representation of the initial computation graph right after tracing.
2.after-fusing.graph.txt
: The textual representation of the intermediate computation graph after fusing.
3.final.graph.txt
: The textual representation of the final computation graph right before MLIR conversion.
mlir.txt
: Information about the MLIR of the function which was compiled using the provided input-set.
client_parameters.json
: Information about the client parameters chosen by Concrete.
You can seek help with your issue by asking a question directly in the community forum.
If you cannot find a solution in the community forum, or if you have found a bug in the library, you could create an issue in our GitHub repository.
For bug reports, try to:
Avoid randomness to ensure reproducibility of the bug
Minimize your function while keeping the bug to expedite the fix
Include your input-set in the issue
Provide clear reproduction steps
Include debug artifacts in the issue
For feature requests, try to:
Give a minimal example of the desired behavior
Explain your use case
This document introduces the progressbar feature that provides visual feedback on the execution progress of large circuits, which can take considerable time to execute.
The following Python code demonstrates how to enable and use the progressbar:
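A sketch, assuming the option is called show_progress; the circuit itself is illustrative:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (x ** 2) + x

circuit = f.compile(range(64), show_progress=True)

# a progress bar is displayed while the circuit executes
circuit.encrypt_run_decrypt(5)
```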
When you run this code, you will see a progressbar like this one:
As the execution proceeds, the progress bar updates:
The progress bar does not measure time. When it shows 50%, it indicates that half of the nodes in the computation graph have been processed, not that half of the time has elapsed. The duration of processing different node types may vary, so the progress bar should not be used to estimate the remaining time.
Once the progressbar fills and execution completes, you will see the following figure:
This document provides an overview of how to analyze compiled circuits and extract statistical data for performance evaluation in Concrete. These statistics help identify bottlenecks and compare circuits.
Concrete calculates statistics based on the following basic operations:
Clear addition: x + y where x is encrypted and y is clear
Encrypted addition: x + y where both x and y are encrypted
Clear multiplication: x * y where x is encrypted and y is clear
Encrypted negation: -x where x is encrypted
Key switch: A building block for table lookups
Packing key switch: A building block for table lookups
Programmable bootstrapping: A building block for table lookups
You can print all statistics using the show_statistics
configuration option:
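For example (the circuit is illustrative):

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (x ** 2) + 3 * x

circuit = f.compile(range(16), show_statistics=True)
```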
This code will print:
Each of these properties can be directly accessed on the circuit (e.g., circuit.programmable_bootstrap_count
).
Imagine you have a neural network with 10 layers, each of them tagged. You can easily see the number of additions and multiplications required for the matrix multiplications of each layer:
This document provides instructions on how to customize the compilation pipeline using Configuration
s in Python and describes various configuration options available.
You can customize Concrete using the fhe.Configuration
:
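For example, a sketch combining a few options named in this document:

```python
from concrete import fhe

configuration = fhe.Configuration(p_error=0.01, dataflow_parallelize=True)

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x + 42

circuit = f.compile(range(10), configuration=configuration)
```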
You can overwrite individual configuration options by specifying kwargs in the compile
method:
You can also combine both ways:
When options are specified both in the configuration
and as kwargs in the compile
method, the kwargs take precedence.
To enable exact clipping,
Or/and approximate clipping, which makes overflow protection faster.
Adjust rounders automatically.
Enable auto parallelization in the compiler.
Enable or disable the debug mode of the compiler. This can show a lot of information, including passes and pattern rewrites.
Enable or disable verbose mode of the compiler. This mainly shows logs from the compiler and is less verbose than the debug mode.
Specify that serialization takes the compressed form of evaluation keys.
Specify that serialization takes the compressed form of input ciphertexts.
Specify that the function must be composable with itself.
Enable dataflow parallelization in the compiler.
Export debugging artifacts automatically on compilation failures.
Enables Table Lookup (TLU) fusing to reduce the number of TLUs.
Enable unsafe features.
Enable FHE execution. Can be enabled later using circuit.enable_fhe_execution()
.
Enable FHE simulation. Can be enabled later using circuit.enable_fhe_simulation()
.
Global error probability for the whole circuit.
Chunk size to use when converting the fhe.if_then_else extension
.
Location of insecure key cache.
Enable loop parallelization in the compiler.
Set the level of circuit partitioning when using fhe.ParameterSelectionStrategy.MULTI
.
PRECISION
: all TLUs with the same input precision have their own parameters.
Enables TLU optimizations based on measured bounds.
Not enabled by default, as it could result in unexpected overflows during runtime.
Configures whether to convert values to their original precision before doing a table lookup on them.
True
enables it for all cases.
False
disables it for all cases.
Integer value enables or disables it depending on the original bit width. With the default value of 8, only the values with original bit width ≤ 8 will be converted to their original precision.
Error probability for individual table lookups.
Set how cryptographic parameters are selected.
Enables printing of TLU fusing to see which table lookups are fused.
How many nested tag elements to display with the progress bar:
True means all tag elements are displayed.
False disables the display.
2 will display elmt1.elmt2.
Title of the progress bar.
Set default exactness mode for the rounding operation:
EXACT
: threshold for rounding up or down is exactly centered between the upper and lower value.
Print computation graph during compilation:
True means always print.
False means never print.
None means print depending on the verbose configuration.
Print MLIR during compilation:
True means always print.
False means never print.
None means print depending on the verbose configuration.
Print optimizer output during compilation:
True means always print.
False means never print.
None means print depending on the verbose configuration.
Display a progress bar during circuit execution.
Print circuit statistics during compilation:
True means always print.
False means never print.
None means print depending on the verbose configuration.
Whether to use the simulate encrypt/run/decrypt methods of the circuit/module instead of actual encryption/evaluation/decryption.
When this option is set to True
, encrypt and decrypt are identity functions, and run is a wrapper around simulation. In other words, this option allows switching off encryption to quickly test if a function has the expected semantic (without paying the price of FHE execution).
This is extremely unsafe and should only be used during development.
For this reason, it requires enable_unsafe_features
to be set to True
.
Use single precision for the whole circuit.
A range restriction to pass to the optimizer to restrict the available crypto-parameters.
A keyset restriction to pass to the optimizer to restrict the available crypto-parameters.
Enable generating code for GPU in the compiler.
Use the insecure key cache.
Print details related to compilation.
Enable automatic scheduling of run
method calls. When enabled, FHE functions are computed in parallel in a background thread pool. When several run
calls are composed, they are automatically synchronized.
For now, it only works for the run method of an FheModule; in that case, you obtain a Future[Value] immediately instead of a Value when the computation is finished.
E.g. my_module.f3.run(my_module.f1.run(a), my_module.f2.run(b)) will run f1 and f2 in parallel in the background, and f3 in the background once both f1 and f2 intermediate results are available.
If you want to manually synchronize on the termination of a full computation, e.g. to return the encrypted result, you can explicitly call value.result() to wait for the result. To simplify testing, decryption does this automatically.
Automatic scheduling behavior can be overridden locally by directly calling a variant of run:
run_sync
: forces the FHE function to run in the current thread, not in the background,
run_async
: forces the FHE function to run in a background thread, immediately returning a Future[Value]
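A minimal sketch of this behavior, assuming my_module is a compiled FheModule with functions f1 and f3, and x_enc and y_enc are already-encrypted inputs (hypothetical names):

```python
# run() returns a Future[Value] immediately; the f1 calls execute in background threads
future = my_module.f3.run(my_module.f1.run(x_enc), my_module.f1.run(y_enc))

# explicitly wait for the whole pipeline to finish and get the encrypted result
encrypted_result = future.result()

# or override the scheduling behavior locally
value = my_module.f1.run_sync(x_enc)     # runs in the current thread
future2 = my_module.f1.run_async(x_enc)  # always returns a Future[Value]
```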
Set the level of security used to perform the optimization of crypto-parameters.
This document explains how to use GPU accelerations with Concrete.
Concrete supports acceleration using one or more GPUs.
This version is not available on PyPI, which only hosts wheels with CPU support.
To use GPU acceleration, install the GPU/CUDA wheel from our package index using the following command:
pip install concrete-python --extra-index-url https://pypi.zama.ai/gpu
.
After installing the GPU/CUDA wheel, you must configure the FHE program compilation to enable GPU offloading using the use_gpu
option.
Our GPU wheels are built with CUDA 11.8 and should be compatible with higher versions of CUDA.
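For example, a minimal sketch of enabling GPU offloading at compile time (the function and inputset are placeholders):

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (x ** 2) % 32

# use_gpu makes the compiler generate code that can be offloaded to the GPU
circuit = f.compile(range(32), use_gpu=True)
```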
By default the compiler and runtime will use all available system resources, including all CPU cores and GPUs. You can adjust this by using the following environment variables:
Type: Integer
Default value: The number of hardware threads on the system (including hyperthreading) minus the number of GPUs in use.
Description: This variable determines the number of CPU threads that execute in parallel with the GPU for offloadable workloads. GPU scheduler threads (including CUDA threads and those used within Concrete) are necessary but can block or interfere with worker thread execution. Therefore, it is recommended to undersubscribe the CPU hardware threads by the number of GPU devices used.
Type: Integer
Default value: The number of GPUs available.
Description: This value determines the number of GPUs to use for offloading. This can be set to any value between 1 and the total number of GPUs on the system.
Type: Integer
Default value: LLONG_MAX (no batch size limit)
Description: This value limits the maximum batch size for offloading in cases where the GPU memory is insufficient.
Type: Integer
Default value: The ratio between the compute capability of the GPU (at index 0) and a CPU core.
Description: This ratio is used to balance the load between the CPU and GPU. If the GPU is underutilized, set this value higher to increase the amount of work offloaded to the GPU.
Type: Integer
Default value: The number of hardware threads on the system, including hyperthreading.
Description: This value determines the number of OpenMP threads used on the CPU to parallelize the portions of program execution that are not yet supported for GPU offload.
This document explains how to format and draw a compiled circuit in Python.
To convert your compiled circuit into its textual representation, use the str
function:
If you just want to see the output on your terminal, you can directly print it as well:
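A minimal sketch, assuming circuit is a compiled circuit:

```python
textual_representation = str(circuit)  # textual form of the circuit
print(circuit)                         # or print it directly to the terminal
```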
Formatting is designed for debugging purposes only. It's not possible to re-create the circuit from its textual representation. See if that's your goal.
Drawing functionality requires the installation of the package with the full feature set. See the section for instructions.
To draw your compiled circuit, use the draw
method:
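A minimal sketch, assuming circuit is a compiled circuit:

```python
drawing_path = circuit.draw()
print(drawing_path)  # path of the temporary PNG file
```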
This method draws the circuit, saves it as a temporary PNG file and returns the file path.
You can display the drawing in a Jupyter notebook:
Alternatively, you can use the show
option of the draw
method to display the drawing with matplotlib
:
Using this option will clear any existing matplotlib plots.
Lastly, to save the drawing to a specific path, use the save_to
option:
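A combined sketch of these two options (the path is a placeholder):

```python
circuit.draw(show=True)                   # display the drawing with matplotlib
circuit.draw(save_to="/tmp/circuit.png")  # save the drawing to a specific path
```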
You can also use tags to analyze specific sections of your circuit. See more detailed explanation in .
Provide fine control for :
Specify preference for bitwise strategies, can be a single strategy or an ordered list of strategies. See to learn more.
Specify preference for comparison strategies. Can be a single strategy or an ordered list of strategies. See to learn more.
Only used when compiling a single circuit; when compiling modules, use the .
If set, the whole circuit will have the probability of a non-exact result smaller than the set value. See to learn more.
PRECISION_AND_NORM2
: all TLUs with the same input precision and output norm2 have their own parameters.
If set, all table lookups will have the probability of a non-exact result smaller than the set value. See to learn more.
APPROXIMATE
: faster but threshold for rounding up or down is approximately centered with a pseudo-random shift. Precise behavior is described in .
Chunk size of the ReLU extension when implementation is used.
Bit-width to start implementing the ReLU extension with .
Enable promotions in encrypted shifts instead of casting at runtime. See to learn more.
This document explains how to serialize and deserialize ciphertexts and secret keys when working with TFHE-rs in Rust.
Concrete already has its serialization functions (e.g. tfhers_bridge.export_value
, tfhers_bridge.import_value
, tfhers_bridge.keygen_with_initial_keys
, tfhers_bridge.serialize_input_secret_key
). However, when implementing a TFHE-rs computation in Rust, we must use a compatible serialization.
We should deserialize FheUint8
using safe serialization functions from TFHE-rs
To serialize
We should deserialize LweSecretKey
using safe serialization functions from TFHE-rs
To serialize
This feature is currently in beta version. Please note that the API may change in future Concrete releases.
This guide explains how to combine Concrete and TFHE-rs computations together. This allows you to convert ciphertexts from Concrete to TFHE-rs, and vice versa, and to run a computation with both libraries without requiring a decryption.
There are differences between Concrete and TFHE-rs, so ensuring interoperability between them involves more than just data serialization. To achieve interoperability, we need to consider two main aspects.
Both TFHE-rs and Concrete libraries use Learning With Errors (LWE) ciphertexts, but integers are encoded differently:
In Concrete, integers are simply encoded in a single ciphertext
In TFHE-rs, integers are encoded into multiple ciphertexts using radix decomposition
Converting between Concrete and TFHE-rs encrypted integers then requires doing an encrypted conversion between the two different encodings.
When working with a TFHE-rs integer type in Concrete, you can use the .encode(...)
and .decode(...)
functions to see this in practice:
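An illustrative sketch of the round trip (tfhers_uint8 stands for a TFHE-rs integer type already defined via fhe.tfhers; its construction is omitted here):

```python
# encode splits the integer into several small radix "digits";
# decode recombines them into a single integer
encoded = tfhers_uint8.encode(123)
decoded = tfhers_uint8.decode(encoded)
assert decoded == 123
```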
The Concrete Optimizer may find parameters which are not in TFHE-rs's pre-computed list. To ensure interoperability, you need to either fix or constrain the search space in parts of the circuit where interoperability is required. This ensures that compatible parameters are used consistently.
There are 2 different approaches to using Concrete and TFHE-rs depending on the situation.
Scenario 1: Shared secret key: In this scenario, a single party aims to combine both Concrete and TFHE-rs in a computation. In this case, a shared secret key will be used, while different keysets will be held for Concrete and TFHE-rs.
Scenario 2: Pregenerated TFHE-rs keys: This scenario involves two parties, each with a pre-established set of TFHE-rs keysets. The objective is to compute on encrypted TFHE-rs data using Concrete. In this case, there is no shared secret key. The party using Concrete will rely solely on TFHE-rs public keys and must optimize the parameters accordingly, while the party using TFHE-rs handles encryption, decryption, and computation.
Concrete already has its serialization functions (such as tfhers_bridge.export_value
, tfhers_bridge.import_value
, tfhers_bridge.keygen_with_initial_keys
, tfhers_bridge.serialize_input_secret_key
, and so on). However, when implementing a TFHE-rs computation in Rust, we must use a compatible serialization. Learn more in Serialization of ciphertexts and keys.
This guide explains how to optimize Concrete circuits extensively.
It's split into 3 sections:
Improve parallelism: to make circuits utilize more cores.
Optimize table lookups: to optimize the most expensive operation in Concrete.
Optimize cryptographic parameters: to make Concrete select more performant parameters.
This guide introduces the different options for parallelism in Concrete and how to utilize them to improve the execution time of Concrete circuits.
Modern CPUs have multiple cores to perform computation and utilizing multiple cores is a great way to boost performance.
There are two kinds of parallelism in Concrete:
Loop parallelism to make tensor operations parallel, achieved by using OpenMP
Dataflow parallelism to make independent operations parallel, achieved by using HPX
Loop parallelism is enabled by default, as it's supported on all platforms. Dataflow parallelism however is only supported on Linux, hence not enabled by default.
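As a sketch, dataflow parallelism can be requested at compile time (assuming a Linux machine; dataflow_parallelize is my assumption about the option name, based on the configuration list above):

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (x + 1) + (x * 2)  # two independent operations that can run concurrently

circuit = f.compile(range(8), dataflow_parallelize=True)
```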
This guide explains dataflow parallelism and how it can improve the execution time of Concrete circuits.
Dataflow parallelism is particularly useful when the circuit performs computations that are neither completely independent (such as loop/doall parallelism) nor fully dependent (e.g. sequential, non-parallelizable code). In such cases, dataflow tasks can execute as soon as their inputs are available, thus minimizing over-synchronization.
Without dataflow parallelism, the circuit is executed operation by operation, like in an imperative language. If the operations themselves are not tensorized, loop parallelism would not be utilized and the entire execution would happen in a single thread. Dataflow parallelism changes this by analyzing the operations and their dependencies within the circuit to determine what can be done in parallel and what cannot. Then it distributes the tasks that can be done in parallel to different threads.
For example:
This prints:
The reason for that is:
To summarize, dataflow analyzes the circuit to determine which parts of the circuit can be run at the same time, and tries to run as many operations as possible in parallel.
When the circuit is tensorized, dataflow might slow execution down since the tensor operations already use multiple threads and adding dataflow on top creates congestion in the CPU between the HPX (dataflow parallelism runtime) and OpenMP (loop parallelism runtime). So try both before deciding on whether to use dataflow or not.
This guide teaches how to improve the execution time of Concrete circuits by using approximate mode for rounding.
You can enable approximate mode to gain even more performance when using rounding by sacrificing some more exactness:
prints:
This guide teaches how to improve the execution time of Concrete circuits by using different conversion strategies for complex operations.
Concrete provides multiple implementation strategies for these complex operations:
The default strategy is the one that doesn't increase the input bit width, even if it's less optimal than the others. If you don't care about the input bit widths (e.g., if the inputs are only used in this operation), you should definitely change the default strategy.
Choosing the correct strategy can lead to big speedups. So if you are not sure which one to use, you can compile with different strategies and compare the complexity.
For example, the following code:
prints:
or:
prints:
As you can see, strategies can affect the performance a lot! So make sure to select the appropriate one for your use case if you want to optimize performance.
This guide explains how to help Concrete Optimizer to select more performant parameters to improve the execution time of Concrete circuits.
The idea is to obtain more optimal cryptographic parameters (especially for table lookups) without changing the operations within the circuit.
This guide explains tensorization and how it can improve the execution time of Concrete circuits.
Tensors should be used instead of scalars when possible to maximize loop parallelism.
For example:
This prints:
Enabling dataflow is similar to letting the runtime do this for you; it would also help in this specific case.
This guide teaches how to improve the execution time of Concrete circuits by reducing the amount of table lookups.
Reducing the number of table lookups is probably the most complicated optimization in this section, as it's not automated. The idea is to use mathematical properties of operations to reduce the number of table lookups needed to achieve the result.
One great example is in adding big integers in bitmap representation. Here is the basic implementation:
There are two table lookups within the loop body, one for >>
and one for %
.
This implementation is not optimal though, since the same output can be achieved with just a single table lookup:
It was possible to do this because the original operations had a mathematical equivalence with the optimized operations and optimized operations achieved the same output with less table lookups!
Here is the full code example and some numbers for this optimization:
prints:
which is almost half the amount of table lookups and ~2x less complexity for the same operation!
This guide teaches how costly table lookups are, and how to optimize them to improve the execution time of Concrete circuits.
The most costly operation in Concrete is the table lookup operation, so one of the primary goals of optimizing performance is to reduce the amount of table lookups.
Furthermore, the bit width of the input of the table lookup plays a major role in performance.
The code above prints:
This document explains how to deploy a circuit after development. After developing your circuit, you may want to deploy it without sharing the circuit's details with every client or hosting computations on dedicated servers. In this scenario, you can use the Client
and Server
features of Concrete.
In a typical Concrete deployment:
The server hosts the compilation artifact, including client specifications and the FHE executable.
The client requests circuit requirements, generates keys, sends an encrypted payload, and receives an encrypted result.
Follow these steps to deploy your circuit:
Develop the circuit: You can develop your own circuit using the techniques discussed in previous chapters. Here is an example.
Save the server files: Once you have your circuit, save the necessary server files.
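A sketch of this step, assuming circuit is the compiled circuit from the previous step:

```python
circuit.server.save("server.zip")
```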
Send the server files: Send server.zip
to your computation server.
Load the server files: To set up the server, load the server.zip
file received from the development machine.
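A sketch of the server side:

```python
from concrete import fhe

server = fhe.Server.load("server.zip")
```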
Prepare for client requests: The server needs to wait for the requests from clients.
Serialize ClientSpecs: Requests typically start with ClientSpecs, as clients need ClientSpecs to generate keys and request computation.
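A sketch of the serialization, using the server object from the previous step:

```python
serialized_client_specs = server.client_specs.serialize()
```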
Send serialized ClientSpecs
to clients.
Create the client object: After receiving the serialized ClientSpecs
from a server, create the Client
object.
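A sketch of the client side, using the received bytes:

```python
from concrete import fhe

client_specs = fhe.ClientSpecs.deserialize(serialized_client_specs)
client = fhe.Client(client_specs)
```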
Generate keys: Once you have the Client
object, perform key generation. This method generates encryption/decryption keys and evaluation keys.
Serialize the evaluation keys: The server needs access to the evaluation keys. You can serialize your evaluation keys as below.
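A sketch of key generation and evaluation-key serialization on the client:

```python
client.keys.generate()
serialized_evaluation_keys = client.evaluation_keys.serialize()
```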
Send the evaluation keys to the server.
Serialized evaluation keys are very large even when compressed, and they can be reused several times: consider caching them on the server.
Encrypt inputs: Encrypt your inputs and request the server to perform some computation. This can be done in the following way.
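A sketch of this step (the argument 7 is a placeholder; adapt it to your circuit's signature):

```python
arguments = client.encrypt(7)
serialized_arguments = arguments.serialize()
```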
Send the serialized arguments to the server.
Deserialize received data: On the server, deserialize the received evaluation keys and arguments received from the client.
Run the computation: Perform the computation and serialize the result.
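A sketch of the server side of these two steps:

```python
deserialized_evaluation_keys = fhe.EvaluationKeys.deserialize(serialized_evaluation_keys)
deserialized_arguments = fhe.Value.deserialize(serialized_arguments)

public_result = server.run(deserialized_arguments, evaluation_keys=deserialized_evaluation_keys)
serialized_public_result = public_result.serialize()
```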
Send the serialized result to the client:
Clear arguments can directly be passed to server.run
(For example, server.run(x, 10, z, evaluation_keys=...)
).
Deserialize the result: Once you receive the serialized result from the server, deserialize it.
Decrypt the deserialized result:
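A sketch of these final client-side steps:

```python
deserialized_public_result = fhe.Value.deserialize(serialized_public_result)
result = client.decrypt(deserialized_public_result)
```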
Deploying a module follows the same logic as the deployment of circuits.
For example, consider a module compiled in the following way:
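An illustrative sketch of such a module (the inc and dec functions and the modulus are placeholders):

```python
from concrete import fhe

@fhe.module()
class Counter:
    @fhe.function({"x": "encrypted"})
    def inc(x):
        return (x + 1) % 20

    @fhe.function({"x": "encrypted"})
    def dec(x):
        return (x - 1) % 20

inputset = list(range(20))
my_module = Counter.compile({"inc": inputset, "dec": inputset})
```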
You can extract the server from the module and save it in a file:
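For instance, a sketch using the module compiled above:

```python
my_module.server.save("server.zip")
```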
The only noticeable difference between the deployment of modules and the deployment of circuits is that the methods Client::encrypt
, Client::decrypt
and Server::run
must contain an extra function_name
argument specifying the name of the targeted function.
For example, to encrypt an argument for the inc
function of the module:
To execute the inc
function:
To decrypt a result from the execution of dec
:
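A combined sketch of these calls, reusing client, server, and the evaluation keys from the circuit deployment flow above (the argument 5 is a placeholder):

```python
# encrypt an argument for the "inc" function
arg = client.encrypt(5, function_name="inc")

# execute the "inc" function on the server
result = server.run(arg, evaluation_keys=evaluation_keys, function_name="inc")

# decrypt the result of that execution; for the "dec" function, pass function_name="dec" instead
decrypted = client.decrypt(result, function_name="inc")
```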
This document explains how to set up a shared secret key between Concrete and TFHE-rs to perform computations.
In this scenario, a shared secret key will be used, while different keysets will be held for Concrete and TFHE-rs. There are two ways to generate keys, outlined in the following steps:
Perform a classical key generation in Concrete, which generates a set of secret and public keys.
Use this secret key to perform a partial key generation in TFHE-rs, starting from the shared secret key and generating the rest of the necessary keys.
Perform a classical key generation in TFHE-rs, generating a single secret key and corresponding public keys.
Use the secret key from TFHE-rs to perform a partial keygen in Concrete.
While TFHE-rs uses a single secret key, Concrete may generate more than one, but only one of them corresponds to the TFHE-rs key. The API hides this detail, but will often ask you to provide the position of a given input/output; this is used to infer which secret key should be used.
After the key generation is complete and we have both keysets, we can perform computations, encryption, and decryption on both ends.
The first step is to define the TFHE-rs ciphertext type that will be used in the computation (see Overview). This includes specifying both cryptographic and encoding parameters. TFHE-rs provides a pre-computed list of recommended parameters, which we will use to avoid manual selection. You can find the parameters used in this guide here.
In short, we first determine a suitable set of parameters from TFHE-rs and then apply them in Concrete. This ensures that the ciphertexts generated in both systems will be compatible by using the same cryptographic parameters.
We will now define a simple modular addition function. This function takes TFHE-rs inputs, converts them to Concrete format (to_native
), runs a computation, and then converts them back to TFHE-rs. The circuit below is a common example that takes and produces TFHE-rs ciphertexts. However, there are other scenarios where you might not convert back to TFHE-rs, or you might convert to a different type than the input. Another possibility is to take one native ciphertext and one TFHE-rs ciphertext.
We can compile the circuit as usual.
You could optionally try the full execution in Concrete
We are going to create a TFHE-rs bridge that facilitates the seamless transfer of ciphertexts and keys between Concrete and TFHE-rs.
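A minimal sketch, assuming circuit is the compiled circuit from the previous step:

```python
from concrete.fhe import tfhers

tfhers_bridge = tfhers.new_bridge(circuit)
```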
In order to establish a shared secret key between Concrete and TFHE-rs, there are two possible methods for key generation. The first method (use case 1.1) involves generating the Concrete keyset first and then using the shared secret key in TFHE-rs to partially generate the TFHE-rs keyset. The second method (use case 1.2) involves doing the opposite. You should only run one of the two following methods.
Remember that one of the key generations needs to be a partial keygen, to be sure that there is a unique and common secret key.
Parameters used in TFHE-rs must be the same as the ones used in Concrete.
First, we generate the Concrete keyset and then serialize the shared secret key that will be used to encrypt the inputs. In our case, this shared secret key is the same for all inputs and outputs.
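A sketch of this step, using the bridge created above (input_idx=0 and the file name are assumptions for this example):

```python
# classical Concrete key generation
circuit.keygen()

# export the secret key used to encrypt input 0, so that TFHE-rs can reuse it
secret_key_bytes = tfhers_bridge.serialize_input_secret_key(input_idx=0)
with open("secret_key.bin", "wb") as f:
    f.write(secret_key_bytes)
```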
Next, we generate client and server keys in TFHE-rs using the shared secret key from Concrete. We will cover serialization in a later section, so there's no need to worry about how we loaded the secret key. For now, we will consider having 4 functions (save_lwe_sk
, save_fheuint8
, load_lwe_sk
, load_fheuint8
) which respectively save/load an LWE secret key and an FheUint8 to/from a given path.
First, we generate the TFHE-rs keyset and then serialize the shared secret key that will be used to encrypt the inputs.
Next, we generate a Concrete keyset using the shared secret key from TFHE-rs.
At this point, we have everything necessary to encrypt, compute, and decrypt on both Concrete and TFHE-rs. Whether you began key generation in Concrete or in TFHE-rs, the keysets on both sides are compatible.
Now, we'll walk through an encryption and computation process in TFHE-rs, transition to Concrete to run the circuit, and then return to TFHE-rs for decryption.
First, we do encryption and a simple addition in TFHE-rs. For more information on how to save ciphertexts, refer to Serialization.
Next, we can load these ciphertexts in Concrete and then run our compiled circuit as usual.
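A sketch of this step (x_buff and y_buff are assumed to hold the serialized TFHE-rs ciphertexts):

```python
# import the TFHE-rs ciphertexts as Concrete values
encrypted_x = tfhers_bridge.import_value(x_buff, input_idx=0)
encrypted_y = tfhers_bridge.import_value(y_buff, input_idx=1)

# run the compiled circuit as usual
encrypted_result = circuit.run(encrypted_x, encrypted_y)
```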
Finally, we can decrypt and decode in Concrete
... or export it to TFHE-rs for computation/decryption
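For the export path, a sketch:

```python
# export the encrypted result so that TFHE-rs can deserialize and decrypt it
result_buff = tfhers_bridge.export_value(encrypted_result, output_idx=0)
with open("result.bin", "wb") as f:
    f.write(result_buff)
```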
Full working example can be found here.
This guide teaches how to improve the execution time of Concrete circuits by using some special operations that reduce the bit width of the input of the table lookup.
There are two extensions which can reduce the bit width of the table lookup input, and , which can improve performance by sacrificing exactness.
For example the following code:
prints:
And displays:
This guide teaches how to improve the execution time of Concrete circuits by using bit extraction.
is a cheap way to extract certain bits of encrypted values. It can be very useful for improving the performance of circuits.
For example:
prints:
That's almost 8x improvement to circuit complexity!
This guide explains how setting p_error
configuration option can affect the performance of Concrete circuits.
Adjusting table lookup error probability is discussed extensively in section. The idea is to sacrifice exactness to gain performance.
For example:
This prints:
This guide explains how to optimize cryptographic parameters by specifying composition when using .
When using , make sure to specify so that the compiler can select more optimal parameters based on how the functions in the module would be used.
For example:
This prints:
It means that specifying composition resulted in ~35% improvement to complexity for computing cube(square(x))
.
The Encrypted Game of Life in Python Using Concrete - November 2023
Compile composable functions with Concrete - February 2024
How to use dynamic table look-ups using Concrete - October 2023
Zama 5-Question Developer Survey
We want to hear from you! Take 1 minute to share your thoughts and help us enhance our documentation and libraries. 👉 Click here to participate.
concrete.compiler
: Compiler submodule.
concrete.compiler.compilation_context
: CompilationContext.
concrete.compiler.compilation_feedback
: Compilation feedback.
concrete.compiler.tfhers_int
: Import and export TFHErs integers into Concrete.
concrete.compiler.utils
: Common utils for the compiler submodule.
concrete.fhe
: Concrete.
concrete.fhe.compilation
: Glue the compilation process together.
concrete.fhe.compilation.artifacts
: Declaration of DebugArtifacts
class.
concrete.fhe.compilation.circuit
: Declaration of Circuit
class.
concrete.fhe.compilation.client
: Declaration of Client
class.
concrete.fhe.compilation.compiler
: Declaration of Compiler
class.
concrete.fhe.compilation.composition
: Declaration of classes related to composition.
concrete.fhe.compilation.configuration
: Declaration of Configuration
class.
concrete.fhe.compilation.decorators
: Declaration of circuit
and compiler
decorators.
concrete.fhe.compilation.evaluation_keys
: Declaration of EvaluationKeys
.
concrete.fhe.compilation.keys
: Declaration of Keys
class.
concrete.fhe.compilation.module
: Declaration of FheModule
classes.
concrete.fhe.compilation.module_compiler
: Declaration of MultiCompiler
class.
concrete.fhe.compilation.server
: Declaration of Server
class.
concrete.fhe.compilation.specs
: Declaration of ClientSpecs
class.
concrete.fhe.compilation.status
: Declaration of EncryptionStatus
class.
concrete.fhe.compilation.utils
: Declaration of various functions and constants related to compilation.
concrete.fhe.compilation.value
: Declaration of Value
class.
concrete.fhe.compilation.wiring
: Declaration of wiring related class.
concrete.fhe.dtypes
: Define available data types and their semantics.
concrete.fhe.dtypes.base
: Declaration of BaseDataType
abstract class.
concrete.fhe.dtypes.float
: Declaration of Float
class.
concrete.fhe.dtypes.integer
: Declaration of Integer
class.
concrete.fhe.dtypes.utils
: Declaration of various functions and constants related to data types.
concrete.fhe.extensions
: Provide additional features that are not present in numpy.
concrete.fhe.extensions.array
: Declaration of array
function, to simplify creation of encrypted arrays.
concrete.fhe.extensions.bits
: Bit extraction extensions.
concrete.fhe.extensions.constant
: Declaration of constant
functions, to allow server side trivial encryption.
concrete.fhe.extensions.convolution
: Tracing and evaluation of convolution.
concrete.fhe.extensions.hint
: Declaration of hinting extensions, to provide more information to Concrete.
concrete.fhe.extensions.identity
: Declaration of identity
extension.
concrete.fhe.extensions.maxpool
: Tracing and evaluation of maxpool.
concrete.fhe.extensions.multivariate
: Declaration of multivariate
extension.
concrete.fhe.extensions.ones
: Declaration of ones
and one
functions, to simplify creation of encrypted ones.
concrete.fhe.extensions.relu
: Declaration of relu
extension.
concrete.fhe.extensions.round_bit_pattern
: Declaration of round_bit_pattern
function, to provide an interface for rounded table lookups.
concrete.fhe.extensions.table
: Declaration of LookupTable
class.
concrete.fhe.extensions.tag
: Declaration of tag
context manager, to allow tagging certain nodes.
concrete.fhe.extensions.truncate_bit_pattern
: Declaration of truncate_bit_pattern
extension.
concrete.fhe.extensions.univariate
: Declaration of univariate
function.
concrete.fhe.extensions.zeros
: Declaration of zeros
and zero
functions, to simplify creation of encrypted zeros.
concrete.fhe.internal.utils
: Declaration of various functions and constants related to the entire project.
concrete.fhe.mlir
: Provide computation graph
to mlir
functionality.
concrete.fhe.mlir.context
: Declaration of Context
class.
concrete.fhe.mlir.conversion
: Declaration of ConversionType
and Conversion
classes.
concrete.fhe.mlir.converter
: Declaration of Converter
class.
concrete.fhe.mlir.processors
: All graph processors.
concrete.fhe.mlir.processors.assign_bit_widths
: Declaration of AssignBitWidths
graph processor.
concrete.fhe.mlir.processors.assign_node_ids
: Declaration of AssignNodeIds
graph processor.
concrete.fhe.mlir.processors.check_integer_only
: Declaration of CheckIntegerOnly
graph processor.
concrete.fhe.mlir.processors.process_rounding
: Declaration of ProcessRounding
graph processor.
concrete.fhe.mlir.utils
: Declaration of various functions and constants related to MLIR conversion.
concrete.fhe.representation
: Define structures used to represent computation.
concrete.fhe.representation.evaluator
: Declaration of various Evaluator
classes, to make graphs picklable.
concrete.fhe.representation.graph
: Declaration of Graph
class.
concrete.fhe.representation.node
: Declaration of Node
class.
concrete.fhe.representation.operation
: Declaration of Operation
enum.
concrete.fhe.representation.utils
: Declaration of various functions and constants related to representation of computation.
concrete.fhe.tfhers
: tfhers module to represent, and compute on tfhers integer values.
concrete.fhe.tfhers.bridge
: Declaration of tfhers.Bridge
class.
concrete.fhe.tfhers.dtypes
: Declaration of TFHERSIntegerType
class.
concrete.fhe.tfhers.tracing
: Tracing of tfhers operations.
concrete.fhe.tfhers.values
: Declaration of TFHERSInteger
which wraps values as being of tfhers types.
concrete.fhe.tracing
: Provide function
to computation graph
functionality.
concrete.fhe.tracing.tracer
: Declaration of Tracer
class.
concrete.fhe.tracing.typing
: Declaration of type annotation.
concrete.fhe.values
: Define the available values and their semantics.
concrete.fhe.values.scalar
: Declaration of ClearScalar
and EncryptedScalar
wrappers.
concrete.fhe.values.tensor
: Declaration of ClearTensor
and EncryptedTensor
wrappers.
concrete.fhe.values.value_description
: Declaration of ValueDescription
class.
concrete.fhe.version
: Version of the project, which is updated automatically by the CI right before releasing.
concrete.lang
: Concretelang python module
concrete.lang.dialects.fhe
: FHE dialect module
concrete.lang.dialects.fhelinalg
: FHELinalg dialect module
concrete.lang.dialects.tracing
: Tracing dialect module
compilation_context.CompilationContext
: Compilation context.
compilation_feedback.MoreCircuitCompilationFeedback
: Helper class for compilation feedback.
tfhers_int.TfhersExporter
: A helper class to import and export TFHErs big integers.
artifacts.DebugArtifacts
: DebugArtifacts class, to export information about the compilation process for single function.
artifacts.DebugManager
: A debug manager, allowing streamlined debugging.
artifacts.FunctionDebugArtifacts
: An object containing debug artifacts for a certain function in an fhe module.
artifacts.ModuleDebugArtifacts
: An object containing debug artifacts for an fhe module.
circuit.Circuit
: Circuit class, to combine computation graph, mlir, client and server into a single object.
client.Client
: Client class, which can be used to manage keys, encrypt arguments and decrypt results.
compiler.Compiler
: Compiler class, to glue the compilation pipeline.
composition.CompositionClause
: A raw composition clause.
composition.CompositionPolicy
: A protocol for composition policies.
composition.CompositionRule
: A raw composition rule.
configuration.ApproximateRoundingConfig
: Controls the behavior of approximate rounding.
configuration.BitwiseStrategy
: BitwiseStrategy, to specify implementation preference for bitwise operations.
configuration.ComparisonStrategy
: ComparisonStrategy, to specify implementation preference for comparisons.
configuration.Configuration
: Configuration class, to allow the compilation process to be customized.
configuration.MinMaxStrategy
: MinMaxStrategy, to specify implementation preference for minimum and maximum operations.
configuration.MultiParameterStrategy
: MultiParamStrategy, to set optimization strategy for multi-parameter.
configuration.MultivariateStrategy
: MultivariateStrategy, to specify implementation preference for multivariate operations.
configuration.ParameterSelectionStrategy
: ParameterSelectionStrategy, to set optimization strategy.
configuration.SecurityLevel
: Security level used to optimize the circuit parameters.
decorators.Compilable
: Compilable class, to wrap a function and provide methods to trace and compile it.
evaluation_keys.EvaluationKeys
: EvaluationKeys required for execution.
keys.Keys
: Keys class, to manage generate/reuse keys.
module.ExecutionRt
: Runtime object class for execution.
module.FheFunction
: Fhe function class, allowing to run or simulate one function of an fhe module.
module.FheModule
: Fhe module class, to combine computation graphs, mlir, runtime objects into a single object.
module.SimulationRt
: Runtime object class for simulation.
module_compiler.FunctionDef
: An object representing the definition of a function as used in an fhe module.
module_compiler.ModuleCompiler
: Compiler class for multiple functions, to glue the compilation pipeline.
server.Server
: Server class, which can be used to perform homomorphic computation.
specs.ClientSpecs
: ClientSpecs class, to create Client objects.
status.EncryptionStatus
: EncryptionStatus enum, to represent encryption status of parameters.
utils.Lazy
: A lazily initialized value.
value.Value
: A public value object that can be sent between client and server.
wiring.AllComposable
: Composition policy that allows to forward any output of the module to any of its input.
wiring.AllInputs
: All the encrypted inputs of a given function of a module.
wiring.AllOutputs
: All the encrypted outputs of a given function of a module.
wiring.Input
: The input of a given function of a module.
wiring.NotComposable
: Composition policy that does not allow the forwarding of any output to any input.
wiring.Output
: The output of a given function of a module.
wiring.TracedOutput
: A wrapper type used to trace wiring.
wiring.Wire
: A forwarding rule between an output and an input.
wiring.WireInput
: A protocol for wire inputs.
wiring.WireOutput
: A protocol for wire outputs.
wiring.WireTracingContextManager
: A context manager returned by the wire_pipeline
method.
wiring.Wired
: Composition policy which allows the forwarding of certain outputs to certain inputs.
base.BaseDataType
: BaseDataType abstract class, to form a basis for data types.
float.Float
: Float class, to represent floating point numbers.
integer.Integer
: Integer class, to represent integers.
bits.Bits
: Bits class, to provide indexing into the bits of integers.
round_bit_pattern.Adjusting
: Adjusting class, to be used as early stop signal during adjustment.
round_bit_pattern.AutoRounder
: AutoRounder class, to optimize for number of msbs to keep during round bit pattern operation.
table.LookupTable
: LookupTable class, to provide a way to do direct table lookups.
truncate_bit_pattern.Adjusting
: Adjusting class, to be used as early stop signal during adjustment.
truncate_bit_pattern.AutoTruncator
: AutoTruncator class, to optimize for the number of msbs to keep during truncate operation.
context.Context
: Context class, to perform operations on conversions.
conversion.Conversion
: Conversion class, to store MLIR operations with additional information.
conversion.ConversionType
: ConversionType class, to make it easier to work with MLIR types.
converter.Converter
: Converter class, to convert a computation graph to MLIR.
assign_bit_widths.AdditionalConstraints
: AdditionalConstraints class to customize bit-width assignment step easily.
assign_bit_widths.AssignBitWidths
: AssignBitWidths graph processor, to assign proper bit-widths to be compatible with FHE.
assign_node_ids.AssignNodeIds
: AssignNodeIds graph processor, to assign node IDs to node properties.
check_integer_only.CheckIntegerOnly
: CheckIntegerOnly graph processor, to make sure the graph only contains integer nodes.
process_rounding.ProcessRounding
: ProcessRounding graph processor, to analyze rounding and support regular operations on it.
utils.Comparison
: Comparison enum, to store the result comparison in 2-bits as there are three possible outcomes.
utils.HashableNdarray
: HashableNdarray class, to use numpy arrays in dictionaries.
evaluator.ConstantEvaluator
: ConstantEvaluator class, to evaluate Operation.Constant nodes.
evaluator.GenericEvaluator
: GenericEvaluator class, to evaluate Operation.Generic nodes.
evaluator.GenericTupleEvaluator
: GenericEvaluator class, to evaluate Operation.Generic nodes where args are packed in a tuple.
evaluator.InputEvaluator
: InputEvaluator class, to evaluate Operation.Input nodes.
graph.Graph
: Graph class, to represent computation graphs.
graph.GraphProcessor
: GraphProcessor base class, to define the API for a graph processing pipeline.
graph.MultiGraphProcessor
: MultiGraphProcessor base class, to define the API for a multiple graph processing pipeline.
node.Node
: Node class, to represent computation in a computation graph.
operation.Operation
: Operation enum, to distinguish nodes within a computation graph.
bridge.Bridge
: TFHErs Bridge, to extend a Module with TFHErs functionalities.
dtypes.CryptoParams
: Crypto parameters used for a tfhers integer.
dtypes.EncryptionKeyChoice
: TFHErs key choice: big or small.
dtypes.TFHERSIntegerType
: TFHERSIntegerType class, to represent tfhers integer types.
values.TFHERSInteger
: TFHERSInteger class, to wrap values into typed values, using tfhers types.
tracer.Annotation
: Base annotation for direct definition.
tracer.ScalarAnnotation
: Base scalar annotation for direct definition.
tracer.TensorAnnotation
: Base tensor annotation for direct definition.
tracer.Tracer
: Tracer class, to create computation graphs from python functions.
typing.f32
: Scalar f32 annotation.
typing.f64
: Scalar f64 annotation.
typing.int1
: Scalar int1 annotation.
typing.int10
: Scalar int10 annotation.
typing.int11
: Scalar int11 annotation.
typing.int12
: Scalar int12 annotation.
typing.int13
: Scalar int13 annotation.
typing.int14
: Scalar int14 annotation.
typing.int15
: Scalar int15 annotation.
typing.int16
: Scalar int16 annotation.
typing.int17
: Scalar int17 annotation.
typing.int18
: Scalar int18 annotation.
typing.int19
: Scalar int19 annotation.
typing.int2
: Scalar int2 annotation.
typing.int20
: Scalar int20 annotation.
typing.int21
: Scalar int21 annotation.
typing.int22
: Scalar int22 annotation.
typing.int23
: Scalar int23 annotation.
typing.int24
: Scalar int24 annotation.
typing.int25
: Scalar int25 annotation.
typing.int26
: Scalar int26 annotation.
typing.int27
: Scalar int27 annotation.
typing.int28
: Scalar int28 annotation.
typing.int29
: Scalar int29 annotation.
typing.int3
: Scalar int3 annotation.
typing.int30
: Scalar int30 annotation.
typing.int31
: Scalar int31 annotation.
typing.int32
: Scalar int32 annotation.
typing.int33
: Scalar int33 annotation.
typing.int34
: Scalar int34 annotation.
typing.int35
: Scalar int35 annotation.
typing.int36
: Scalar int36 annotation.
typing.int37
: Scalar int37 annotation.
typing.int38
: Scalar int38 annotation.
typing.int39
: Scalar int39 annotation.
typing.int4
: Scalar int4 annotation.
typing.int40
: Scalar int40 annotation.
typing.int41
: Scalar int41 annotation.
typing.int42
: Scalar int42 annotation.
typing.int43
: Scalar int43 annotation.
typing.int44
: Scalar int44 annotation.
typing.int45
: Scalar int45 annotation.
typing.int46
: Scalar int46 annotation.
typing.int47
: Scalar int47 annotation.
typing.int48
: Scalar int48 annotation.
typing.int49
: Scalar int49 annotation.
typing.int5
: Scalar int5 annotation.
typing.int50
: Scalar int50 annotation.
typing.int51
: Scalar int51 annotation.
typing.int52
: Scalar int52 annotation.
typing.int53
: Scalar int53 annotation.
typing.int54
: Scalar int54 annotation.
typing.int55
: Scalar int55 annotation.
typing.int56
: Scalar int56 annotation.
typing.int57
: Scalar int57 annotation.
typing.int58
: Scalar int58 annotation.
typing.int59
: Scalar int59 annotation.
typing.int6
: Scalar int6 annotation.
typing.int60
: Scalar int60 annotation.
typing.int61
: Scalar int61 annotation.
typing.int62
: Scalar int62 annotation.
typing.int63
: Scalar int63 annotation.
typing.int64
: Scalar int64 annotation.
typing.int7
: Scalar int7 annotation.
typing.int8
: Scalar int8 annotation.
typing.int9
: Scalar int9 annotation.
typing.tensor
: Tensor annotation.
typing.uint1
: Scalar uint1 annotation.
typing.uint10
: Scalar uint10 annotation.
typing.uint11
: Scalar uint11 annotation.
typing.uint12
: Scalar uint12 annotation.
typing.uint13
: Scalar uint13 annotation.
typing.uint14
: Scalar uint14 annotation.
typing.uint15
: Scalar uint15 annotation.
typing.uint16
: Scalar uint16 annotation.
typing.uint17
: Scalar uint17 annotation.
typing.uint18
: Scalar uint18 annotation.
typing.uint19
: Scalar uint19 annotation.
typing.uint2
: Scalar uint2 annotation.
typing.uint20
: Scalar uint20 annotation.
typing.uint21
: Scalar uint21 annotation.
typing.uint22
: Scalar uint22 annotation.
typing.uint23
: Scalar uint23 annotation.
typing.uint24
: Scalar uint24 annotation.
typing.uint25
: Scalar uint25 annotation.
typing.uint26
: Scalar uint26 annotation.
typing.uint27
: Scalar uint27 annotation.
typing.uint28
: Scalar uint28 annotation.
typing.uint29
: Scalar uint29 annotation.
typing.uint3
: Scalar uint3 annotation.
typing.uint30
: Scalar uint30 annotation.
typing.uint31
: Scalar uint31 annotation.
typing.uint32
: Scalar uint32 annotation.
typing.uint33
: Scalar uint33 annotation.
typing.uint34
: Scalar uint34 annotation.
typing.uint35
: Scalar uint35 annotation.
typing.uint36
: Scalar uint36 annotation.
typing.uint37
: Scalar uint37 annotation.
typing.uint38
: Scalar uint38 annotation.
typing.uint39
: Scalar uint39 annotation.
typing.uint4
: Scalar uint4 annotation.
typing.uint40
: Scalar uint40 annotation.
typing.uint41
: Scalar uint41 annotation.
typing.uint42
: Scalar uint42 annotation.
typing.uint43
: Scalar uint43 annotation.
typing.uint44
: Scalar uint44 annotation.
typing.uint45
: Scalar uint45 annotation.
typing.uint46
: Scalar uint46 annotation.
typing.uint47
: Scalar uint47 annotation.
typing.uint48
: Scalar uint48 annotation.
typing.uint49
: Scalar uint49 annotation.
typing.uint5
: Scalar uint5 annotation.
typing.uint50
: Scalar uint50 annotation.
typing.uint51
: Scalar uint51 annotation.
typing.uint52
: Scalar uint52 annotation.
typing.uint53
: Scalar uint53 annotation.
typing.uint54
: Scalar uint54 annotation.
typing.uint55
: Scalar uint55 annotation.
typing.uint56
: Scalar uint56 annotation.
typing.uint57
: Scalar uint57 annotation.
typing.uint58
: Scalar uint58 annotation.
typing.uint59
: Scalar uint59 annotation.
typing.uint6
: Scalar uint6 annotation.
typing.uint60
: Scalar uint60 annotation.
typing.uint61
: Scalar uint61 annotation.
typing.uint62
: Scalar uint62 annotation.
typing.uint63
: Scalar uint63 annotation.
typing.uint64
: Scalar uint64 annotation.
typing.uint7
: Scalar uint7 annotation.
typing.uint8
: Scalar uint8 annotation.
typing.uint9
: Scalar uint9 annotation.
value_description.ValueDescription
: ValueDescription class, to combine data type, shape, and encryption status into a single object.
compiler.check_gpu_available
: Check whether a CUDA device is available and online.
compiler.check_gpu_enabled
: Check whether the compiler and runtime support GPU offloading.
compiler.init_dfr
: Initialize dataflow parallelization.
compiler.round_trip
: Parse the MLIR input, then return it back.
compilation_feedback.tag_from_location
: Extract tag of the operation from its location.
utils.lookup_runtime_lib
: Try to find the absolute path to the runtime library.
decorators.circuit
: Provide a direct interface for compilation of single circuit programs.
decorators.compiler
: Provide an easy interface for the compilation of single-circuit programs.
decorators.function
: Provide an easy interface to define a function within an fhe module.
decorators.module
: Provide an easy interface for the compilation of multi functions modules.
utils.add_nodes_from_to
: Add nodes from from_nodes
to to_nodes
, to all_nodes
.
utils.check_subgraph_fusibility
: Determine if a subgraph can be fused.
utils.convert_subgraph_to_subgraph_node
: Convert a subgraph to Operation.Generic node.
utils.find_closest_integer_output_nodes
: Find the closest upstream integer output nodes to a set of start nodes in a graph.
utils.find_float_subgraph_with_unique_terminal_node
: Find a subgraph with float computations that end with an integer output.
utils.find_single_lca
: Find the single lowest common ancestor of a list of nodes.
utils.find_tlu_subgraph_with_multiple_variable_inputs_that_has_a_single_common_ancestor
: Find a subgraph with a tlu computation that has multiple variable inputs where all variable inputs share a common ancestor.
utils.friendly_type_format
: Convert a type to a string. Remove package name and class/type keywords.
utils.fuse
: Fuse appropriate subgraphs in a graph to a single Operation.Generic node.
utils.get_terminal_size
: Get the terminal size.
utils.inputset
: Generate a random inputset.
utils.is_single_common_ancestor
: Determine if a node is the single common ancestor of a list of nodes.
utils.validate_input_args
: Validate input arguments.
utils.combine_dtypes
: Get the 'BaseDataType' that can represent a set of 'BaseDataType's.
array.array
: Create an encrypted array from either encrypted or clear values.
bits.bits
: Extract bits of integers.
constant.constant
: Trivial encryption of a cleartext value.
convolution.conv
: Trace and evaluate convolution operations.
hint.hint
: Hint the compilation process about properties of a value.
identity.identity
: Apply identity function to x.
identity.refresh
: Refresh x.
maxpool.maxpool
: Evaluate or trace MaxPool operation.
multivariate.multivariate
: Wrap a multivariate function so that it is traced into a single generic node.
ones.one
: Create an encrypted scalar with the value of one.
ones.ones
: Create an encrypted array of ones.
ones.ones_like
: Create an encrypted array of ones with the same shape as another array.
relu.relu
: Rectified linear unit extension.
round_bit_pattern.round_bit_pattern
: Round the bit pattern of an integer.
tag.tag
: Introduce a new tag to the tag stack.
truncate_bit_pattern.truncate_bit_pattern
: Truncate the bit pattern of an integer.
univariate.univariate
: Wrap a univariate function so that it is traced into a single generic node.
zeros.zero
: Create an encrypted scalar with the value of zero.
zeros.zeros
: Create an encrypted array of zeros.
zeros.zeros_like
: Create an encrypted array of zeros with the same shape as another array.
utils.assert_that
: Assert a condition.
utils.unreachable
: Raise a RuntimeError to indicate unreachable code is entered.
utils.construct_deduplicated_tables
: Construct lookup tables for each cell of the input for an Operation.Generic node.
utils.construct_table
: Construct the lookup table for an Operation.Generic node.
utils.construct_table_multivariate
: Construct the lookup table for a multivariate node.
utils.flood_replace_none_values
: Use flooding algorithm to replace None
values.
utils.format_constant
: Get the textual representation of a constant.
utils.format_indexing_element
: Format an indexing element.
tfhers.get_type_from_params
: Get a TFHE-rs integer type from TFHE-rs parameters in JSON format.
bridge.new_bridge
: Create a TFHErs bridge from a circuit or module.
tracing.from_native
: Convert a Concrete integer to the tfhers representation.
tracing.to_native
: Convert a tfhers integer to the Concrete representation.
scalar.clear_scalar_builder
: Build a clear scalar value.
scalar.encrypted_scalar_builder
: Build an encrypted scalar value.
tensor.clear_tensor_builder
: Build a clear tensor value.
tensor.encrypted_tensor_builder
: Build an encrypted tensor value.
This document details the concept of truncating, and how it is used in Concrete to make some FHE computations especially faster.
Table lookups have a strict constraint on the number of bits they support. This can be limiting, especially if you don't need exact precision. As well as this, using larger bit-widths leads to slower table lookups.
To overcome these issues, truncated table lookups are introduced. This operation provides a way to zero the least significant bits of a large integer and then apply the table lookup on the resulting (smaller) value.
Imagine you have a 5-bit value; you can use fhe.truncate_bit_pattern(value, lsbs_to_remove=2) to truncate it (here, the last 2 bits are discarded). Once truncated, the value will remain on 5 bits (e.g., 22 = 0b10110 would be truncated to 20 = 0b10100), and its last 2 bits would be zero. Concrete uses this to optimize table lookups on the truncated value: the 5-bit table lookup gets optimized to a 3-bit table lookup, which is much faster!
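As a minimal illustration of this call (a sketch; the squaring is just an arbitrary table lookup):

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    # zero the 2 least significant bits before the table lookup
    return fhe.truncate_bit_pattern(x, lsbs_to_remove=2) ** 2

circuit = f.compile(range(32))  # 5-bit inputs
assert circuit.encrypt_run_decrypt(22) == 20 ** 2  # 22 = 0b10110 is truncated to 20 = 0b10100
```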
Let's see how truncation works in practice:
prints:
and displays:
Now, let's see how truncating can be used in FHE.
prints:
These speed-ups can vary from system to system.
The reason why the speed-up is not increasing with lsbs_to_remove
is because the truncating operation itself has a cost: each bit removal is a PBS. Therefore, if a lot of bits are removed, truncation itself could take longer than the bigger TLU which is evaluated afterwards.
and displays:
Truncating is very useful but, in some cases, you don't know how many bits your input contains, so it's not reliable to specify lsbs_to_remove
manually. For this reason, the AutoTruncator
class is introduced.
AutoTruncator
allows you to set how many of the most significant bits to keep, but they need to be adjusted using an inputset to determine how many of the least significant bits to remove. This can be done manually using fhe.AutoTruncator.adjust(function, inputset)
, or by setting auto_adjust_truncators
configuration to True
during compilation.
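A sketch of that usage (target_msbs=3 is an arbitrary choice for this example):

```python
from concrete import fhe

truncator = fhe.AutoTruncator(target_msbs=3)  # defined outside the compiled function

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.truncate_bit_pattern(x, lsbs_to_remove=truncator) ** 2

inputset = list(range(32))
fhe.AutoTruncator.adjust(f, inputset)  # determines lsbs_to_remove from the inputset
circuit = f.compile(inputset)
```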
Here is how auto truncators can be used in FHE:
prints:
and displays:
AutoTruncator
s should be defined outside the function that is being compiled. They are used to store the result of the adjustment process, so they shouldn't be created each time the function is called. Furthermore, each AutoTruncator
should be used with exactly one truncate_bit_pattern
call.
This document details the concept of rounding, and how it is used in Concrete to make some FHE computations especially faster.
Table lookups have a strict constraint on the number of bits they support. This can be limiting, especially if you don't need exact precision. As well as this, using larger bit-widths leads to slower table lookups.
To overcome these issues, rounded table lookups are introduced. This operation provides a way to round the least significant bits of a large integer and then apply the table lookup on the resulting (smaller) value.
Imagine you have a 5-bit value, but you want to have a 3-bit table lookup. You can call fhe.round_bit_pattern(input, lsbs_to_remove=2)
and use the 3-bit value you receive as input to the table lookup.
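As a minimal illustration of this call (a sketch; the squaring is just an arbitrary table lookup):

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    # keep only 3 bits of effective precision before the table lookup
    return fhe.round_bit_pattern(x, lsbs_to_remove=2) ** 2

circuit = f.compile(range(32))  # 5-bit inputs
```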
Let's see how rounding works in practice:
prints:
and displays:
If the rounded number is one of the last 2**(lsbs_to_remove - 1)
numbers in the input range [0, 2**original_bit_width)
, an overflow will happen.
By default, if an overflow is encountered during inputset evaluation, bit-widths will be adjusted accordingly. This results in a loss of speed, but ensures accuracy.
You can turn this overflow protection off (e.g., for performance) by using fhe.round_bit_pattern(..., overflow_protection=False)
. However, this could lead to unexpected behavior at runtime.
Now, let's see how rounding can be used in FHE.
prints:
These speed-ups can vary from system to system.
The reason why the speed-up is not increasing with lsbs_to_remove
is because the rounding operation itself has a cost: each bit removal is a PBS. Therefore, if a lot of bits are removed, rounding itself could take longer than the bigger TLU which is evaluated afterwards.
and displays:
Feel free to disable overflow protection and see what happens.
Rounding is very useful but, in some cases, you don't know how many bits your input contains, so it's not reliable to specify lsbs_to_remove
manually. For this reason, the AutoRounder
class is introduced.
AutoRounder
allows you to set how many of the most significant bits to keep, but they need to be adjusted using an inputset to determine how many of the least significant bits to remove. This can be done manually using fhe.AutoRounder.adjust(function, inputset)
, or by setting auto_adjust_rounders
configuration to True
during compilation.
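A sketch of that usage, analogous to the AutoTruncator case (target_msbs=3 is an arbitrary choice):

```python
from concrete import fhe

rounder = fhe.AutoRounder(target_msbs=3)  # defined outside the compiled function

@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.round_bit_pattern(x, lsbs_to_remove=rounder) ** 2

inputset = list(range(32))
fhe.AutoRounder.adjust(f, inputset)  # determines lsbs_to_remove from the inputset
circuit = f.compile(inputset)
```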
Here is how auto rounders can be used in FHE:
prints:
and displays:
AutoRounder
s should be defined outside the function that is being compiled. They are used to store the result of the adjustment process, so they shouldn't be created each time the function is called. Furthermore, each AutoRounder
should be used with exactly one round_bit_pattern
call.
One use of rounding is doing faster computation by ignoring the less significant bits. For this usage, you can get even faster results if you accept the rounding itself to be slightly inexact. The speedup is usually around 2x-3x but can be higher for big precision reductions. This also enables higher-precision values that are not possible otherwise.
You can turn on this mode either globally on the configuration:
or on/off locally:
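A sketch of both ways (the function is a placeholder; rounding_exactness is my assumption about the name of the configuration option described earlier):

```python
from concrete import fhe

# globally, through the configuration
@fhe.compiler({"x": "encrypted"})
def f(x):
    return fhe.round_bit_pattern(x, lsbs_to_remove=3) ** 2

circuit = f.compile(range(64), rounding_exactness=fhe.Exactness.APPROXIMATE)

# or locally, per rounding call
@fhe.compiler({"x": "encrypted"})
def g(x):
    return fhe.round_bit_pattern(x, lsbs_to_remove=3, exactness=fhe.Exactness.APPROXIMATE) ** 2
```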
In approximate mode, the threshold for rounding up or down is not perfectly centered. The off-centering:
is bounded, i.e. at worst an off-by-one on the reduced precision value compared to the exact result,
is pseudo-random, i.e. it will be different on each call,
is almost symmetrically distributed,
depends on cryptographic properties like the encryption mask, the encryption noise and the crypto-parameters.
With approximate rounding, you can enable approximate clipping to further improve performance when handling overflow. Approximate clipping discards the extra overflow-protection bit in the successor TLU. For consistency, a logical clipping is available when this optimization is not suitable.
When fast approximate clipping is not suitable (i.e. slower), it's better to apply logical clipping for consistency and better resilience to code changes. It has no extra cost since it's fused with the successor TLU.
This sets the first precision at which approximate clipping is enabled. Starting from this precision, an extra small-precision TLU is introduced to safely remove the extra bit used to contain overflow, so that the successor TLU is faster. E.g. for a rounding to 7 bits that would otherwise end in an 8-bit TLU due to overflow, forcing the use of a 7-bit TLU is 3x faster.
This document describes how floating points are treated and manipulated in Concrete.
Concrete partly supports floating points. There is no support for floating point inputs or outputs. However, there is support for intermediate values to be floating points (under certain constraints). Also, we note that one can use an equivalent of fixed points in Concrete, as described in .
The Concrete Compiler, which is used for compiling the circuit, doesn't support floating points at all. However, it supports table lookups which take an integer and map it to another integer. The constraints of this operation are that there should be a single integer input and a single integer output.
As long as your floating point operations comply with those constraints, Concrete automatically converts them to a table lookup operation:
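An illustrative sketch consistent with the description below (the exact operations are placeholders; a, b and c are floating point intermediates, while x and d are integers):

```python
import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    a = x + 1.5                         # floating point intermediate
    b = np.sin(x)                       # floating point intermediate
    c = a + b                           # floating point intermediate
    d = np.around(c).astype(np.int64)   # integer output that depends only on x
    return d

# a, b and c are fused into a single table lookup from x to d
circuit = f.compile(range(8))
```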
In the example above, a
, b
, and c
are floating point intermediates. They are used to calculate d
, which is an integer with a value dependent upon x
, which is also an integer. Concrete detects this and fuses all of these operations into a single table lookup from x
to d
.
This approach works for a variety of use cases, but it comes up short for others:
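A sketch of such a failing case, consistent with the explanation that follows (the operations are placeholders):

```python
import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    a = x + 1.5
    b = np.sin(y)                       # depends on y, not on x
    c = a + b
    d = np.around(c).astype(np.int64)   # depends on both x and y
    return d

# compiling this function raises an exception: the float subgraph has two
# variable inputs, so it cannot be fused into a single table lookup
```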
This results in:
The reason for the error is that d
no longer depends solely on x
; it depends on y
as well. Concrete cannot fuse these operations, so it raises an exception instead.
This document lists the operations you can use inside the function that you are compiling.
Some operations are not supported between two encrypted values. If attempted, a detailed error message will be raised.
ndarray methods.
ndarray properties.
This document explains the different passes happening in the compilation process, from the Concrete Python frontend to the Concrete MLIR compiler.
There are two main entry points to the Concrete Compiler. The first is to use the Concrete Python frontend. The second is to use the Compiler directly, which takes MLIR as input. Concrete Python is more high level and uses the Compiler under the hood.
Compilation begins in the frontend with tracing to get an easy-to-manipulate representation of the function. We call this representation a Computation Graph
, which is a Directed Acyclic Graph (DAG) containing nodes representing computations done in the function. Working with graphs is useful because they have been studied extensively and there are a lot of available algorithms to manipulate them. Internally, we use , which is an excellent graph library for Python.
The next step in compilation is transforming the computation graph. There are many transformations we perform, and these are discussed in their own sections. The result of a transformation is another computation graph.
After transformations are applied, we need to determine the bounds (i.e., the minimum and the maximum values) of each intermediate node. This is required because FHE allows limited precision for computations. Measuring these bounds helps determine the required precision for the function.
The frontend is almost done at this stage and only needs to transform the computation graph to equivalent MLIR code. Once the MLIR is generated, our Compiler backend takes over. Any other frontend wishing to use the Compiler needs to plug in at this stage.
The Compiler takes MLIR code that makes use of both the FHE and FHELinalg dialects for scalar and tensor operations, respectively.
Compilation then ends with a series of lowering passes that generate a native binary containing executable code. Crypto parameters are generated along the way as well.
We start with a Python function f
, such as this one:
The goal of tracing is to create the following computation graph without requiring any change from the user.
(Note that the edge labels are for non-commutative operations. To give an example, a subtraction node represents (predecessor with edge label 0) - (predecessor with edge label 1)
)
To do this, we make use of Tracers, which are objects that record the operations performed during their creation. We create a Tracer for each argument of the function and call the function with those Tracers. Tracers make use of the operator overloading feature of Python to achieve their goal:
2 * y will be performed first, and * is overloaded for Tracer to return another tracer: Tracer(computation=Multiply(Constant(2), self.computation)), which is equal to Tracer(computation=Multiply(Constant(2), Input("y"))).
x + (2 * y) will be performed next, and + is overloaded for Tracer to return another tracer: Tracer(computation=Add(self.computation, (2 * y).computation)), which is equal to Tracer(computation=Add(Input("x"), Multiply(Constant(2), Input("y")))).
In the end, we will have output tracers that can be used to create the computation graph. The implementation is a bit more complex than this, but the idea is the same.
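A toy sketch of the idea (not Concrete's actual implementation) using a tuple-based representation:

```python
# Each Tracer records the operation that created it instead of computing a value.
class Tracer:
    def __init__(self, computation):
        self.computation = computation

    def __add__(self, other):
        return Tracer(("Add", self.computation, _lift(other)))

    def __mul__(self, other):
        return Tracer(("Multiply", self.computation, _lift(other)))

    def __rmul__(self, other):
        return Tracer(("Multiply", _lift(other), self.computation))

def _lift(value):
    # Constants are wrapped, other Tracers contribute their recorded computation.
    return value.computation if isinstance(value, Tracer) else ("Constant", value)

def f(x, y):
    return x + 2 * y

output = f(Tracer(("Input", "x")), Tracer(("Input", "y")))
print(output.computation)
# ('Add', ('Input', 'x'), ('Multiply', ('Constant', 2), ('Input', 'y')))
```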
Tracing is also responsible for indicating whether the values in the node would be encrypted or not. The rule for that is: if a node has an encrypted predecessor, it is encrypted as well.
The goal of topological transforms is to make more functions compilable.
With the current version of Concrete, floating-point inputs and floating-point outputs are not supported. However, if the floating-point operations are intermediate operations, they can sometimes be fused into a single table lookup from integer to integer, thanks to some specific transforms.
Let's take a closer look at the transforms we can currently perform.
Given a computation graph, the goal of the bounds measurement step is to assign the minimal data type to each node in the graph.
If we have an encrypted input that is always between 0 and 10, we should assign the type EncryptedScalar<uint4> to its node, as this is the minimal encrypted integer that supports all values between 0 and 10.
If there were negative values in the range, we could have used intX instead of uintX.
Bounds measurement is necessary because FHE supports limited precision, and we don't want unexpected behaviour while evaluating the compiled functions.
Let's take a closer look at how we perform bounds measurement.
This is a simple approach that requires an inputset to be provided by the user.
The inputset should not be confused with a dataset in the classical ML sense, as it doesn't require labels. Rather, the inputset is a set of values which are typical inputs of the function.
The idea is to evaluate each input in the inputset and record the result of each operation in the computation graph. Then we compare the evaluation results with the current minimum/maximum values of each node and update the minimum/maximum accordingly. After the entire inputset is evaluated, we assign a data type to each node using the minimum and maximum values it contains.
Here is an example, given this computation graph where x
is encrypted:
and this inputset:
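The original graph and inputset are not reproduced here; a plausible reconstruction, consistent with the bounds recorded below, is:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return (2 * x) + 3   # nodes: x, 2, *, 3, +

inputset = [2, 3, 1]
circuit = f.compile(inputset)
```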
Evaluation result of 2:
x: 2
2: 2
*: 4
3: 3
+: 7
New bounds:
x: [2, 2]
2: [2, 2]
*: [4, 4]
3: [3, 3]
+: [7, 7]
Evaluation result of 3:
x: 3
2: 2
*: 6
3: 3
+: 9
New bounds:
x: [2, 3]
2: [2, 2]
*: [4, 6]
3: [3, 3]
+: [7, 9]
Evaluation result of 1:
x: 1
2: 2
*: 2
3: 3
+: 5
New bounds:
x: [1, 3]
2: [2, 2]
*: [2, 6]
3: [3, 3]
+: [5, 9]
Assigned data types:
x: EncryptedScalar<uint2>
2: ClearScalar<uint2>
*: EncryptedScalar<uint3>
3: ClearScalar<uint2>
+: EncryptedScalar<uint4>
We describe below some of the main passes in the compilation pipeline.
TFHE Parameterization takes care of introducing the chosen parameters in the Intermediate Representation (IR). After this pass, you should be able to see the dimension of ciphertexts, as well as other parameters in the IR.
This pass lowers TFHE operations to low level operations that are closer to the backend implementation, working on tensors and memory buffers (after a bufferization pass).
This pass lowers everything to LLVM-IR in order to generate the final binary.
This document details the management of Table Lookups (TLUs) within Concrete for advanced usage. For a simpler guide, refer to the basic Table Lookup documentation.
One of the most common operations in Concrete is the Table Lookup (TLU). All operations except addition, subtraction, multiplication with non-encrypted values, tensor manipulation operations, and a few operations built with those primitive operations (e.g. matmul, conv) are converted to Table Lookups under the hood.
Table Lookups are very flexible. They allow Concrete to support many operations, but they are expensive. The exact cost depends on many variables (hardware used, error probability, etc.), but they are always much more expensive compared to other operations. You should try to avoid them as much as possible. It's not always possible to avoid them completely, but you might reduce the number of TLUs or replace some of them with other primitive operations.
Concrete automatically parallelizes TLUs if they are applied to tensors.
Concrete provides a LookupTable class to create your own tables and apply them in your circuits.
LookupTables can have any number of elements. Let's call the number of elements N. As long as the lookup variable is within the range [-N, N), the Table Lookup is valid.
If you go outside of this range, you will receive the following error:
You can create the lookup table using a list of integers and apply it using indexing:
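For instance, a minimal sketch (the table contents are illustrative):

```python
from concrete import fhe

table = fhe.LookupTable([2, -1, 3, 0])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

circuit = f.compile(range(4))
assert circuit.encrypt_run_decrypt(2) == 3
```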
When you apply a table lookup to a tensor, the scalar table lookup is applied to each element of the tensor:
LookupTable mimics array indexing in Python, which means if the lookup variable is negative, the table is looked up from the back:
If you want to apply a different lookup table to each element of a tensor, you can have a LookupTable of LookupTables:
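A sketch consistent with the description below, assuming 2-bit inputs and tensors whose last dimension is 2:

```python
import numpy as np
from concrete import fhe

squared = fhe.LookupTable([i ** 2 for i in range(4)])
cubed = fhe.LookupTable([i ** 3 for i in range(4)])
table = fhe.LookupTable([squared, cubed])

@fhe.compiler({"x": "encrypted"})
def f(x):
    return table[x]

inputset = [np.random.randint(0, 4, size=(3, 2)) for _ in range(10)]
circuit = f.compile(inputset)
```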
In this example, we applied a squared table to the first column and a cubed table to the second column.
Concrete tries to fuse some operations into table lookups automatically so that lookup tables don't need to be created manually:
All lookup tables need to be from integers to integers. So, without .astype(np.int64), Concrete will not be able to fuse.
The function is first traced into:
Concrete then fuses appropriate nodes:
Fusing makes the code more readable and easier to modify, so try to utilize it over manual LookupTables as much as possible.
TLUs are performed with an FHE operation called Programmable Bootstrapping (PBS). PBSs have a certain probability of error: when these errors happen, they result in inaccurate results.
Let's say you have the table:
And you perform a Table Lookup using 4. The result you should get is lut[4] = 16, but because of the possibility of error, you could get any other value in the table.
The probability of this error can be configured through the p_error and global_p_error configuration options. The difference between these two options is that p_error is for individual TLUs, while global_p_error is for the whole circuit.
If you set p_error to 0.01, for example, every TLU in the circuit will have a 99% chance (or more) of being exact. If there is a single TLU in the circuit, this corresponds to global_p_error = 0.01 as well. But if there are 2 TLUs, then global_p_error would be higher: 1 - (0.99 * 0.99) ~= 0.02 = 2%.
If you set global_p_error to 0.01, the whole circuit will have at most a 1% probability of error, no matter how many Table Lookups are included (which means that p_error will be smaller than 0.01 if there is more than a single TLU).
If you set both of them, both will be satisfied. Essentially, the stricter one will be used.
By default, both p_error and global_p_error are set to None, which results in a global_p_error of 1 / 100_000 being used.
Configuring either of those variables impacts compilation and execution times (compilation, keys generation, circuit execution) and space requirements (size of the keys on disk and in memory). Lower error probabilities result in longer compilation and execution times and larger space requirements.
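A minimal sketch of setting these probabilities at compile time (the function and inputset are illustrative):

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    return x ** 2

inputset = range(16)

circuit_per_tlu = f.compile(inputset, p_error=0.01)        # each TLU is exact with probability >= 99%
circuit_global = f.compile(inputset, global_p_error=0.01)  # the whole circuit is exact with probability >= 99%
```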
We have allocated a whole new chapter to explaining fusing.
This pass converts high level operations which are not crypto specific to lower level operations from the TFHE scheme. Ciphertexts get introduced in the code as well. TFHE operations and ciphertexts require some parameters which need to be chosen, and the pass does just that.
We refer users to the documentation of extensions for explanations about the fhe.univariate(function) and fhe.multivariate(function) features, which are convenient ways to use automatically created table lookups.
Feel free to play with these configuration options to pick the one best suited for your needs! See the configuration documentation to learn how you can set a custom p_error and/or global_p_error.
PBSs are very expensive in terms of computation. Fortunately, it is sometimes possible to replace a PBS with cheaper variants, such as rounded or approximate table lookups. These TLUs have slightly different semantics, but are very useful in cases like machine learning for more efficiency without a drop in accuracy.
This document explains how to compute the minimum and maximum between values in Concrete, covering different strategies to make the computations faster, depending on the context.
Finding the minimum or maximum of two numbers is not a native operation in Concrete, so it needs to be implemented using existing native operations (i.e., additions, clear multiplications, negations, table lookups). Concrete offers two different implementations for this.
This is the most general implementation that can be used in any situation. The idea is:
Initial comparison is chunked as well, which is already very expensive.
The multiplications involved aren't allowed to increase the bit-width of the inputs, so they are very expensive as well.
Optimal chunk size is selected automatically to reduce the number of table lookups.
Chunked comparisons result in at least 9 and at most 21 table lookups.
It is used if no other implementation can be used.
Can be used with any integers.
Extremely expensive.
produces
This implementation uses the fact that [min,max](x, y) is equal to [min,max](x - y, 0) + y, which is just a subtraction, a table lookup, and an addition!
There are two major problems with this implementation though:
subtraction before the TLU requires up to 2 additional bits to avoid overflows (it is 1 in most cases).
subtraction and addition require the same bit-width across operands.
What this means is that if we are comparing uint3
and uint6
, we need to convert both of them to uint7
in some way to do the subtraction and proceed with the TLU in 7-bits. There are 2 ways to achieve this behavior.
This strategy makes sure that during bit-width assignment, both operands are assigned the same bit-width, and that bit-width contains at least the amount of bits required to store x - y
. The idea is:
It will always result in a single table lookup.
It will increase the bit-width of both operands and the result, and lock them together across the whole circuit, which can result in significant slowdowns if the result or the operands are used in other costly operations.
produces
This strategy will not put any constraint on bit-widths during bit-width assignment. Instead, operands are cast to a bit-width that can store x - y
during runtime using table lookups. The idea is:
It can result in a single table lookup as well, if x and y are assigned (because of other operations) the same bit-width, and that bit-width can store x - y
.
Or in two table lookups if only one of the operands is assigned a bit-width bigger than or equal to the bit width that can store x - y
.
It will not put any constraints on bit-widths of the operands, which is amazing if they are used in other costly operations.
It will result in at most 3 table lookups, which is still good.
If you are not doing anything else with the operands, or doing less costly operations compared to comparison, it will introduce up to two unnecessary table lookups and slow down execution compared to fhe.MinMaxStrategy.ONE_TLU_PROMOTED
.
produces
| Strategy | Minimum # of TLUs | Maximum # of TLUs | Can increase the bit-width of the inputs |
|---|---|---|---|
| CHUNKED | 9 | 21 | |
| ONE_TLU_PROMOTED | 1 | 1 | ✓ |
| THREE_TLU_CASTED | 1 | 3 | |
Concrete will choose the best strategy available after bit-width assignment, regardless of the specified preference.
Different strategies are good for different circuits. If you want the best runtime for your use case, you can compile your circuit with all different comparison strategy preferences, and pick the one with the lowest complexity.
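A sketch (assumed option name) of how a strategy preference could be specified:

```python
import numpy as np
from concrete import fhe

# min_max_strategy_preference is assumed to be the configuration option name.
configuration = fhe.Configuration(
    min_max_strategy_preference=fhe.MinMaxStrategy.THREE_TLU_CASTED,
)

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    return np.minimum(x, y)

inputset = [(np.random.randint(0, 2**3), np.random.randint(0, 2**6)) for _ in range(100)]
circuit = f.compile(inputset, configuration=configuration)
```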
This document describes how bitwise operations are managed in Concrete, typically "AND", "OR", and so on. It covers different strategies to make the FHE computations faster, depending on the context.
Bitwise operations are not native operations in Concrete, so they need to be implemented using existing native operations (i.e., additions, clear multiplications, negations, table lookups). Concrete offers two different implementations for performing bitwise operations.
This is the most general implementation that can be used in any situation. The idea is:
Signed bitwise operations are not supported.
The optimal chunk size is selected automatically to reduce the number of table lookups.
Chunked bitwise operations result in at least 4 and at most 9 table lookups.
It is used if no other implementation can be used.
Can be used with any integers.
Very expensive.
produces
This implementation uses the fact that we can combine two values into a single value and apply a single table lookup to this combined value!
There are two major problems with this implementation:
packing requires the same bit-width across operands.
packing requires the bit-width of at least x.bit_width + y.bit_width
and that bit-width cannot exceed maximum TLU bit-width, which is 16
at the moment.
What this means is if we are comparing uint3
and uint6
, we need to convert both of them to uint9
in some way to do the packing and proceed with the TLU in 9-bits. There are 4 ways to achieve this behavior.
This strategy makes sure that during bit-width assignment, both operands are assigned the same bit-width, and that bit-width contains at least the amount of bits required to store pack(x, y)
. The idea is:
It will always result in a single table lookup.
It will significantly increase the bit-width of both operands and lock them to each other across the whole circuit, which can result in significant slowdowns if the operands are used in other costly operations.
produces
This strategy will not put any constraint on bit-widths during bit-width assignment, instead operands are cast to a bit-width that can store pack(x, y)
during runtime using table lookups. The idea is:
It can result in a single table lookup as well, if x and y are assigned (because of other operations) the same bit-width, and that bit-width can store pack(x, y)
.
Or in two table lookups if only one of the operands is assigned a bit-width bigger than or equal to the bit width that can store pack(x, y)
.
It will not put any constraints on bit-widths of the operands, which is amazing if they are used in other costly operations.
It will result in at most 3 table lookups, which is still good.
If you are not doing anything else with the operands, or doing less costly operations compared to bitwise, it will introduce up to two unnecessary table lookups and slow down execution compared to fhe.BitwiseStrategy.ONE_TLU_PROMOTED
.
produces
This strategy can be viewed as a middle ground between the two strategies described above. With this strategy, only the bigger operand will be constrained to have at least the required bit-width to store pack(x, y)
, and the smaller operand will be cast to that bit-width during runtime. The idea is:
It can result in a single table lookup as well, if the smaller operand is assigned (because of other operations) the same bit-width as the bigger operand.
It will only put a constraint on the bigger operand, which is great if the smaller operand is used in other costly operations.
It will result in at most 2 table lookups, which is great.
It will significantly increase the bit-width of the bigger operand which can result in significant slowdowns if the bigger operand is used in other costly operations.
If you are not doing anything else with the smaller operand, or doing less costly operations compared to comparison, it could introduce an unnecessary table lookup and slow down execution compared to fhe.BitwiseStrategy.THREE_TLU_CASTED
.
produces
This strategy is like the exact opposite of the strategy above. With this, only the smaller operand will be constrained to have at least the required bit-width, and the bigger operand will be cast during runtime. The idea is:
It can result in a single table lookup as well, if the bigger operand is assigned (because of other operations) the same bit-width as the smaller operand.
It will only put constraint on the smaller operand, which is great if the bigger operand is used in other costly operations.
It will result in at most 2 table lookups, which is great.
It will increase the bit-width of the smaller operand which can result in significant slowdowns if the smaller operand is used in other costly operations.
If you are not doing anything else with the bigger operand, or doing less costly operations compared to comparison, it could introduce an unnecessary table lookup and slow down execution compared to fhe.BitwiseStrategy.THREE_TLU_CASTED
.
produces
| Strategy | Minimum # of TLUs | Maximum # of TLUs | Can increase the bit-width of the inputs |
|---|---|---|---|
| CHUNKED | 4 | 9 | |
| ONE_TLU_PROMOTED | 1 | 1 | ✓ |
| THREE_TLU_CASTED | 1 | 3 | |
| TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED | 1 | 2 | ✓ |
| TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED | 1 | 2 | ✓ |
Concrete will choose the best strategy available after bit-width assignment, regardless of the specified preference.
Different strategies are good for different circuits. If you want the best runtime for your use case, you can compile your circuit with all different comparison strategy preferences, and pick the one with the lowest complexity.
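A sketch (assumed option name) of specifying a bitwise strategy preference directly as a compile keyword argument:

```python
import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    return x | y

inputset = [(np.random.randint(0, 2**3), np.random.randint(0, 2**6)) for _ in range(100)]
# bitwise_strategy_preference is assumed to be the configuration option name.
circuit = f.compile(inputset, bitwise_strategy_preference=fhe.BitwiseStrategy.THREE_TLU_CASTED)
```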
The same configuration option is used to modify the behavior of encrypted shift operations. Shifts are much more complex to implement, so we will not go over the details. What is important is that the end result is computed using additions or subtractions on the original shifted operand. Since additions and subtractions require the same bit-width across operands, input and output bit-widths need to be synchronized at some point. There are two ways to do this:
Here, the shifted operand and shift result are assigned the same bit-width during bit-width assignment, which avoids an additional TLU on the shifted operand. On the other hand, it might increase the bit-width of the result or the shifted operand, and if they're used in other costly operations, it could result in significant slowdowns. This is the default behavior.
produces
The approach described above could be suboptimal for some circuits, so it is advised to check the complexity with it disabled before production. Here is how the implementation changes with it disabled.
produces
This document describes how comparisons are managed in Concrete, typically 'equal', 'greater than', and so on. It covers different strategies to make the FHE computations faster, depending on the context.
Comparisons are not native operations in Concrete, so they need to be implemented using existing native operations (i.e., additions, clear multiplications, negations, table lookups). Concrete offers three different implementations for performing comparisons.
This is the most general implementation that can be used in any situation. The idea is:
Signed comparisons are more complex to explain, but they are supported!
The optimal chunk size is selected automatically to reduce the number of table lookups.
Chunked comparisons result in at least 5 and at most 13 table lookups.
It is used if no other implementation can be used.
== and != use a different chunk comparison and reduction strategy with fewer table lookups.
Can be used with any integers.
Very expensive.
produces
This implementation uses the fact that x [<,<=,==,!=,>=,>] y is equal to x - y [<,<=,==,!=,>=,>] 0, which is just a subtraction and a table lookup!
There are two major problems with this implementation:
subtraction before the TLU requires up to 2 additional bits to avoid overflows (it is 1 in most cases).
subtraction requires the same bit-width across operands.
What this means is if we are comparing uint3
and uint6
, we need to convert both of them to uint7
in some way to do the subtraction and proceed with the TLU in 7-bits. There are 4 ways to achieve this behavior.
This strategy makes sure that during bit-width assignment, both operands are assigned the same bit-width, and that bit-width contains at least the number of bits required to store x - y
. The idea is:
It will always result in a single table lookup.
It will increase the bit-width of both operands and lock them to each other across the whole circuit, which can result in significant slowdowns if the operands are used in other costly operations.
produces
This strategy will not put any constraint on bit-widths during bit-width assignment, instead operands are cast to a bit-width that can store x - y
during runtime using table lookups. The idea is:
It can result in a single table lookup, if x and y are assigned (because of other operations) the same bit-width and that bit-width can store x - y
.
Alternatively, two table lookups can be used if only one of the operands is assigned a bit-width bigger than or equal to the bit width that can store x - y
.
It will not put any constraints on the bit-widths of the operands, which is amazing if they are used in other costly operations.
It will result in at most 3 table lookups, which is still good.
If you are not doing anything else with the operands, or doing less costly operations compared to comparison, it will introduce up to two unnecessary table lookups and slow down execution compared to fhe.ComparisonStrategy.ONE_TLU_PROMOTED
.
produces
This strategy can be seen as a middle ground between the two strategies described above. With this strategy, only the bigger operand will be constrained to have at least the required bit-width to store x - y
, and the smaller operand will be cast to that bit-width during runtime. The idea is:
It can result in a single table lookup, if the smaller operand is assigned (because of other operations) the same bit-width as the bigger operand.
It will only put a constraint on the bigger operand, which is great if the smaller operand is used in other costly operations.
It will result in at most 2 table lookups, which is great.
It will increase the bit-width of the bigger operand, which can result in significant slowdowns if the bigger operand is used in other costly operations.
If you are not doing anything else with the smaller operand, or doing less costly operations compared to comparison, it could introduce an unnecessary table lookup and slow down execution compared to fhe.ComparisonStrategy.THREE_TLU_CASTED
.
produces
This strategy can be seen as the exact opposite of the strategy above. With this, only the smaller operand will be constrained to have at least the required bit-width, and the bigger operand will be cast during runtime. The idea is:
It can result in a single table lookup, if the bigger operand is assigned (because of other operations) the same bit-width as the smaller operand.
It will only put a constraint on the smaller operand, which is great if the bigger operand is used in other costly operations.
It will result in at most 2 table lookups, which is great.
It will increase the bit-width of the smaller operand, which can result in significant slowdowns if the smaller operand is used in other costly operations.
If you are not doing anything else with the bigger operand, or doing less costly operations compared to comparison, it could introduce an unnecessary table lookup and slow down execution compared to fhe.ComparisonStrategy.THREE_TLU_CASTED
.
produces
This implementation uses the fact that the subtraction trick is not optimal in terms of the required intermediate bit-width. The comparison result does not change whether we compare(3, 40) or compare(3, 4), so why not clip the bigger operand before the subtraction to use fewer bits!
There are two major problems with this implementation:
it cannot be used when the bit-widths are the same (and in some cases even when they differ by only one bit)
subtraction still requires the same bit-width across operands.
What this means is that if we are comparing uint3 and uint6, we need to clip the bigger operand and convert both operands to uint4 in some way to do the subtraction and proceed with the TLU in 4 bits. There are 2 ways to achieve this behavior.
This strategy will not put any constraint on bit-widths during bit-width assignment, instead the smaller operand is cast to a bit-width that can store clipped(bigger) - smaller
or smaller - clipped(bigger)
during runtime using table lookups. The idea is:
This is a fallback implementation, so if there is a difference of 1-bit (or in some cases 2-bits) and the subtraction trick cannot be used optimally, this implementation will be used instead of fhe.ComparisonStrategy.CHUNKED
.
It can result in two table lookups if the smaller operand is assigned a bit-width bigger than or equal to the bit width that can store clipped(bigger) - smaller
or smaller - clipped(bigger)
.
It will not put any constraints on the bit-widths of the operands, which is amazing if they are used in other costly operations.
It will result in at most 3 table lookups, which is still good.
These table lookups will be on smaller bit-widths, which is great.
Cannot be used to compare integers with the same bit-width, which is very common.
produces
This strategy is similar to the strategy described above. The difference is that with this strategy, the smaller operand will be constrained to have at least the required bit-width to store clipped(bigger) - smaller
or smaller - clipped(bigger)
. The bigger operand will still be clipped to that bit-width during runtime. The idea is:
It will only put a constraint on the smaller operand, which is great if the bigger operand is used in other costly operations.
It will result in exactly 2 table lookups, which is great.
It will increase the bit-width of the bigger operand, which can result in significant slowdowns if the bigger operand is used in other costly operations.
produces
| Strategy | Minimum # of TLUs | Maximum # of TLUs | Can increase the bit-width of the inputs |
|---|---|---|---|
| CHUNKED | 5 | 13 | |
| ONE_TLU_PROMOTED | 1 | 1 | ✓ |
| THREE_TLU_CASTED | 1 | 3 | |
| TWO_TLU_BIGGER_PROMOTED_SMALLER_CASTED | 1 | 2 | ✓ |
| TWO_TLU_BIGGER_CASTED_SMALLER_PROMOTED | 1 | 2 | ✓ |
| THREE_TLU_BIGGER_CLIPPED_SMALLER_CASTED | 2 | 3 | |
| TWO_TLU_BIGGER_CLIPPED_SMALLER_PROMOTED | 2 | 2 | ✓ |
Concrete will choose the best strategy available after bit-width assignment, regardless of the specified preference.
Different strategies are good for different circuits. If you want the best runtime for your use case, you can compile your circuit with all different comparison strategy preferences, and pick the one with the lowest complexity.
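A sketch (assumed option and attribute names) of trying each preference and keeping the one with the lowest complexity:

```python
import numpy as np
from concrete import fhe

@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def f(x, y):
    return x < y

inputset = [(np.random.randint(0, 2**3), np.random.randint(0, 2**6)) for _ in range(100)]

# comparison_strategy_preference and circuit.complexity are assumed names.
complexities = {}
for strategy in fhe.ComparisonStrategy:
    circuit = f.compile(inputset, comparison_strategy_preference=strategy)
    complexities[strategy] = circuit.complexity  # lower is better

best_strategy = min(complexities, key=complexities.get)
```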
This document explains the concept of tagging, which is a debugging tool to make a link between the user's Python code and the Concrete MLIR circuits. Such a link can be useful when an issue is raised by the compiler on some MLIR, to know which Python code it corresponds to.
When you have big circuits, keeping track of which node corresponds to which part of your code becomes difficult. A tagging system can simplify such situations:
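The original snippet is not reproduced here; a plausible sketch of tagging regions of a function with the fhe.tag context manager is:

```python
from concrete import fhe

@fhe.compiler({"x": "encrypted"})
def f(x):
    with fhe.tag("preprocessing"):
        x = x + 1
    with fhe.tag("computation"):
        x = x ** 2
    return x
```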
When you compile f with an inputset of range(10), you get the following graph:
If you get an error, you'll see exactly where the error occurred (e.g., which layer of the neural network, if you tag layers).
In the future, we plan to use tags for additional features (e.g., to measure performance of tagged regions), so it's a good idea to start utilizing them for big circuits.
This document explains the concept of direct circuits in Concrete, which is another way to compile circuit without having to give a proper inputset.
Direct circuits are still experimental. It is very easy to make mistakes (e.g., due to no overflow checks or type coercion) while using direct circuits, so utilize them with care.
For some applications, the data types of inputs, intermediate values, and outputs are known (e.g., for manipulating bytes, you would want to use uint8). Using inputsets to determine bounds in these cases is not necessary, and can even be error-prone. Therefore, another interface for defining such circuits is introduced:
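A minimal sketch (the function body is illustrative) of the direct circuit interface, where data types are declared through annotations instead of being measured from an inputset:

```python
from concrete import fhe

@fhe.circuit({"x": "encrypted"})
def circuit(x: fhe.uint8):
    return x + 42
```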
There are a few differences between direct circuits and traditional circuits:
Remember that the resulting dtype for each operation will be determined by its inputs. This can lead to some unexpected results if you're not careful (e.g., if you do -x where x: fhe.uint8, you won't receive a negative value as the result will be fhe.uint8 as well).
There is no inputset evaluation when using fhe types in .astype(...) calls (e.g., np.sqrt(x).astype(fhe.uint4)), so the bit-width of the output cannot be determined automatically.
Specify the resulting data type in the univariate extension (e.g., fhe.univariate(function, outputs=fhe.uint4)(x)), for the same reason as above.
Be careful with overflows. With inputset evaluation, you'll get bigger bit widths but no overflows. With direct definition, you must ensure that there aren't any overflows manually.
Let's review a more complicated example to see how direct circuits behave:
This prints:
Here is the breakdown of the assigned data types:
As you can see, %8 is the subtraction of two unsigned values, and the result is unsigned as well. In the case that c > d, we have an overflow, and this results in undefined behavior.
There are two ways to contribute to Concrete. You can:
Open issues to report bugs and typos or suggest ideas;
Request to become an official contributor by emailing hello@zama.ai. Only approved contributors can send pull requests (PRs), so get in touch before you do.
This document describes the concept of fusing, which is the act of combining multiple nodes into a single node, which is converted to a Table Lookup.
Code related to fusing is in the frontends/concrete-python/concrete/fhe/compilation/utils.py file. Fusing can be performed using the fuse function.
Within fuse:
We loop until there are no more subgraphs to fuse.
Within each iteration:
2.1. We find a subgraph to fuse.
2.2. We search for a terminal node that is appropriate for fusing.
2.3. We crawl backwards to find the closest integer nodes to this node.
2.4. If there is a single node as such, we return the subgraph from this node to the terminal node.
2.5. Otherwise, we try to find the lowest common ancestor (lca) of this list of nodes.
2.6. If an lca doesn't exist, we say this particular terminal node is not fusable, and we go back to search for another subgraph.
2.7. Otherwise, we use this lca as the input of the subgraph and continue with subgraph
node creation below.
2.8. We convert the subgraph into a subgraph
node by checking fusability status of the nodes of the subgraph in this step.
2.9. We substitute the subgraph
node to the original graph.
With the current implementation, we cannot fuse subgraphs that depend on multiple encrypted values where those values don't have a common lca (e.g., np.round(np.sin(x) + np.cos(y))
).
*Using the default configuration in approximate mode, for 3, 4, 5 and 6 reduced precision bits and accumulator precision up to 32 bits.
In blue, the exact values; the red dots are approximate values due to the off-centered transition in approximate mode.
Histogram of transition off-centering deltas. Each count corresponds to a specific random mask and a specific encryption noise.
Only the last step is clipped.
The last steps are decreased.
Concrete is a modular framework composed of sub-projects using different technologies, each with its own build system and test suite. Each sub-project has its own README that explains how to set up the developer environment, how to build it, and how to run the test commands.
Concrete is made of 4 main categories of sub-projects, organized in subdirectories from the root of the Concrete repo:
frontends contains high-level transpilers that target developers who want to use the Concrete stack easily from their usual environment. For now, there is only one frontend provided by the Concrete project: a Python frontend named concrete-python.
compilers contains the sub-projects in charge of actually solving the compilation problem of turning a high-level abstraction of an FHE program into an actual executable. concrete-optimizer is a Rust-based project that solves the optimization problem of turning an FHE dag into a TFHE dag, and concrete-compiler, which uses concrete-optimizer, is an end-to-end MLIR-based compiler that takes a crypto-free FHE dialect and generates compilation artifacts for both the client and the server. In addition to the compilation engine, the concrete-compiler project provides a client and a server library, in order to easily use the compilation artifacts to implement a client and server protocol.
backends contains the C APIs that can be called by the concrete-compiler runtime to perform the cryptographic operations. There are currently two backends:
concrete-cpu, using TFHE-rs, which implements the fastest implementation of TFHE on CPU.
concrete-cuda, which provides a GPU acceleration of TFHE primitives.
tools contains basically every other sub-project that cannot be classified in the three previous categories and which is used as common support by the others.
The module structure of Concrete Python. You are encouraged to check individual .py
files to learn more.
concrete
fhe
dtypes: data type specifications (e.g., int4, uint5, float32)
values: value specifications (i.e., data type + shape + encryption status)
representation: representation of computation (e.g., computation graphs, nodes)
tracing: tracing of python functions
extensions: custom functionality
mlir: computation graph to mlir conversion
compilation: configuration, compiler, artifacts, circuit, client/server, and anything else related to compilation
The Concrete backends are implementations of the cryptographic primitives of the Zama variant of TFHE. The compiler emits code which combines calls into these backends to perform more complex homomorphic operations.
There are client and server features.
Client features are:
private (G)LWE key generation (currently random bits)
encryption of ciphertexts using a private key
public key generation from private keys for keyswitch, bootstrap or private packing
(de)serialization of ciphertexts and public keys (also needed server side)
Server features are homomorphic operations on ciphertexts:
linear operations (multisums with plain weights)
keyswitch
simple PBS
WoP PBS
There are currently 2 backends:
concrete-cpu
which implements both client and server features targeting the CPU.
concrete-cuda
which implements only server features targeting GPUs to accelerate homomorphic circuit evaluation.
The compiler uses concrete-cpu
for the client and can use either concrete-cpu
or concrete-cuda
for the server.
This document describes some security concepts around FHE that can help you generate parameters that are both secure and correct.
To select secure cryptographic parameters for usage in Concrete, we utilize the Lattice Estimator. In particular, we use the following workflow:
Data Acquisition
For a given set of LWE parameters (dimension, modulus, noise standard deviation), we obtain raw data from the Lattice Estimator, which ultimately leads to a security level. All relevant attacks in the Lattice Estimator are considered.
Increase the value of the noise standard deviation until the parameter tuple satisfies the target level of security.
Repeat for several values of the LWE dimension.
Model Generation for each of the target security levels {80, 112, 128, 192}.
At this point, we have several sets of points satisfying the target level of security. From here, we fit a model to this raw data (the noise size as a function of the LWE dimension).
Model Verification.
For each model, we perform a verification check to ensure that the values output from the function provide the claimed level of security.
These models are then used as input for Concrete, to ensure that the parameter space explored by the compiler attains the required security level. Note that we consider the RC.BDGL16
lattice reduction cost model within the Lattice Estimator. Therefore, when computing our security estimates, we use the call LWE.estimate(params, red_cost_model = RC.BDGL16)
on the input parameter set params
.
The cryptographic parameters are chosen considering the IND-CPA security model, and are selected with a bootstrapping failure probability fixed by the user. In particular, it is assumed that the results of decrypted computations are not shared by the secret key owner with any third parties, as such an action can lead to leakage of the secret encryption key. If you are designing an application where decryptions must be shared, you will need to craft custom encryption parameters which are chosen in consideration of the IND-CPA^D security model [1].
[1] Li, Baiyu, et al. “Securing approximate homomorphic encryption using differential privacy.” Annual International Cryptology Conference. Cham: Springer Nature Switzerland, 2022. https://eprint.iacr.org/2022/816.pdf
To generate the raw data from the Lattice Estimator, use:
To compare the current curves with the output of the Lattice Estimator, use:
To generate the associated cpp and rust code, use:
Further advanced options can be found inside the Makefile.
This object is a tuple containing the information required for the four security curves ({80, 112, 128, 192} bits of security). Looking at one of the entries:
This document provides an overview of Fully Homomorphic Encryption (FHE) to get you started with Concrete.
Homomorphic encryption allows computations on ciphertexts without revealing the underlying plaintexts. A scheme is considered fully homomorphic if it supports an unlimited number of additions and multiplications.
Let x represent a plaintext and E[x] the corresponding ciphertext:
Homomorphic addition: E[x] + E[y] = E[x + y]
Homomorphic multiplication: E[x] * E[y] = E[x * y]
FHE encrypts data as LWE ciphertexts, represented visually as a bit vector. The encrypted message is located in the higher-order (yellow) bits, while the lower-order (gray) bits contain random noise that ensures the security of the ciphertext.
Each operation on an encrypted value increases the noise, and if it becomes too large, it may overlap with the message and corrupt its value. To reduce the noise of a ciphertext, the Bootstrap operation generates a new ciphertext encrypting the same message, but with lower noise. This allows additional operations to be performed on the encrypted message.
In typical FHE programs, operations are followed by a bootstrap, and this sequence repeats multiple times.
The amount of noise in a ciphertext is not strictly bounded: since the errors are drawn randomly from a Gaussian distribution, they can be of varying size. This means that we need to be careful to ensure the noise terms do not affect the message bits. If the error terms do overflow into the message bits, this can cause an incorrect output (a failure) when bootstrapping.
Concrete uses PBS to evaluate functions homomorphically:
For example, consider a function (or circuit) that takes a 4-bit input variable and outputs the maximum value between a clear constant and the encrypted input:
This function could be turned into a table lookup:
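A sketch, assuming the clear constant is 5 and a 4-bit input:

```python
# The table maps each possible 4-bit input x to max(5, x).
lut = [max(5, x) for x in range(2 ** 4)]
# lut == [5, 5, 5, 5, 5, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
```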
The lookup table lut is applied during the Programmable Bootstrapping.
You don't need to manage PBS operations manually, as they are handled automatically by Concrete during the compilation process. Each function evaluation is converted into a lookup table and evaluated via PBS.
For example, if you inspect the MLIR code generated by the frontend, you’ll see the lookup table in the 4th line of the following output:
There are 2 things to keep in mind about PBS:
Input type constraints: PBS operations add constraints on input types and thus limit the maximum bit-width supported in Concrete.
By default, this script will generate parameter curves for {80, 112, 128, 192} bits of security, using the Lattice Estimator.
This will compare the four curves generated above against the output of the chosen version of the Lattice Estimator.
To look at the raw data gathered in step 1, we can inspect the stored data objects. These objects can be loaded in the following way using SageMath:
Entries are tuples of LWE parameters together with the corresponding security estimate. We can view individual entries via:
To view the interpolated curves, we load the verified_curves.sobj object.
Here we can see the linear model parameters along with the security level 128. This linear model can be used to generate secure parameters in the following way: for a target of 128 bits of security, given an LWE dimension, the required noise size is obtained by evaluating the linear model at that dimension:
This value corresponds to the logarithm of the relative error size. Using the parameter set in the Lattice Estimator confirms a 128-bit security level.
In Concrete, the default failure probability is set to 1/100,000, meaning that 1 in every 100,000 executions may result in an error. Reducing this probability requires adjusting cryptographic parameters, potentially lowering performance. Conversely, allowing a higher probability of error may improve performance.
While we've covered arithmetic operations, typical programs also involve functions (for example, maximum, minimum, square root). In TFHE, the Bootstrap operation can be enhanced with a table lookup, creating a Programmable Bootstrap (PBS).
Homomorphic univariate function evaluation: PBS(E[x], f) = E[f(x)]
PBS performance impact: PBS operations are costly, so minimizing the number of PBS can improve circuit performance. PBS cost also varies with input precision (for example, an 8-bit PBS is faster than a 16-bit PBS). To learn more about optimizing PBS, refer to the relevant documentation section.
concrete-optimizer
is a tool that selects appropriate cryptographic parameters for a given fully homomorphic encryption (FHE) computation. These parameters have an impact on the security, correctness, and efficiency of the computation.
The computation is guaranteed to be secure with the given level of security (see here for details) which is typically 128 bits. The correctness of the computation is guaranteed up to a given failure probability. A surrogate of the execution time is minimized which allows for efficient FHE computation.
The cryptographic parameters are degrees of freedom in the FHE algorithms (bootstrapping, keyswitching, etc.) that need to be fixed. The search space for possible crypto-parameters is finite but extremely large. The role of the optimizer is to quickly find the most efficient crypto-parameters possible while guaranteeing security and correctness.
The security level is chosen by the user. We typically operate at a fixed security level, such as 128 bits, to ensure that there is never a trade-off between security and efficiency. This constraint imposes a minimum amount of noise in all ciphertexts.
An independent public research tool, the lattice estimator, is used to estimate the security level. The lattice estimator is maintained by FHE experts. For a given set of crypto-parameters, this tool considers all possible attacks and returns a security level.
For each security level, a parameter curve of the appropriate minimal error level is pre-computed using the lattice estimator, and is used as an input to the optimizer. Learn more about the parameter curves here.
Correctness decreases as the level of noise increases. Noise accumulates during homomorphic computation until it is actively reduced via bootstrapping. Too much noise can lead to the result of a computation being inaccurate or completely incorrect.
Before optimization, we compute a noise bound that guarantees a given error level (under the assumption that noise growth is correctly managed via bootstrapping). The noise growth depends on a critical quantity: the 2-norm of any dot product (or equivalent) present in the calculus. This 2-norm changes the scale of the noise, so we must reduce it sufficiently for the next dot product operation whenever we reduce the noise.
The user can control error probability in two ways: via the PBS error probability and the global error probability.
The PBS error probability controls correctness locally (i.e., represents the error probability of a single PBS operation), while the global error probability focuses on the overall computation result (i.e., represents the error probability of the entire computation). These probabilities are related, and choosing which one to use may depend on the specific use case.
Efficiency decreases as more precision is required, e.g. 7 bits versus 8 bits. The larger the 2-norm is, the bigger the noise will be after a dot product. To remain below the noise bound, we must ensure that the inputs to the dot product have a sufficiently small noise level. The smaller this noise is, the slower the previous bootstrapping will be. Therefore, the larger the 2-norm is, the slower the computation will be.
The optimization prioritizes security and correctness. This means that the security level (or the probability of correctness) could, in practice, be a bit higher than the level which is requested by the user.
In the simplest case, the optimizer performs an exhaustive search in the full parameter space and selects the best solution. While the space to explore is huge, exact lower bound cuts are used to avoid exploring regions which are guaranteed to not contain an optimal point. This makes the process both fast and exhaustive. This case is called mono-parameter, where all parameters are shared by the whole computation graph.
In more complex cases, the optimizer iteratively performs an exhaustive search, with lower bound cuts in a wide subspace of the full parameter space, until it converges to a locally optimal solution. Since the wide subspace is large and multi-dimensional, it should not be trapped in a poor locally optimal solution. The more complex case is called multi-parameter, where different calculus operations have tailored parameters.
One can have a look at reference crypto-parameters for each security level (for a given correctness). This provides insight into the relation between the calculus content (i.e., maximum precision, maximum dot 2-norm, etc.) and the cost.
Then one can manually explore crypto-parameters space using a CLI tool.
If you use this tool in your work, please cite:
Bergerat, Loris and Boudi, Anas and Bourgerie, Quentin and Chillotti, Ilaria and Ligier, Damien and Orfila, Jean-Baptiste and Tap, Samuel, Parameter Optimization and Larger Precision for (T)FHE, Journal of Cryptology, 2023, Volume 36.
A pre-print is available as Cryptology ePrint Archive Paper 2022/704.
The Concrete backends are implementations of the cryptographic primitives of the Zama variant of TFHE.
There are client features (private and public key generation, encryption and decryption) and server features (homomorphic operations on ciphertexts using public keys).
Considering that:
performance improvements are mostly beneficial for the server operations, and
the client needs to be portable across the variety of clients that may exist,
we expect mostly server backends to be added to the compiler to improve performance (e.g., by using specialized hardware).
The server backend should expose C or C++ functions to do TFHE operations using the current ciphertext and key memory representation (or functions to change representation). A backend can support only a subset of the current TFHE operations.
The most common operations one would be expected to add are WP-PBS (standard TFHE programmable bootstrap), keyswitch, and WoP (without padding bootstrap).
Linear operations may also be supported but may need more work since their introduction may interfere with other compilation passes. The following example does not include this.
We will detail how concrete-cuda
is integrated in the compiler. Adding a new server feature backend (for non linear operations) should be quite similar. However, if you want to integrate a backend but it does not fit with this description, please open an issue or contact us to discuss the integration.
In compilers/concrete-compiler/Makefile
the variable CUDA_SUPPORT
has been added and set to OFF
(CUDA_SUPPORT?=OFF
) by default
the variables CUDA_SUPPORT
and CUDA_PATH
are passed to CMake
In compilers/concrete-compiler/compiler/include/concretelang/Runtime/context.h, the RuntimeContext struct is enriched with state to manage the backend resources (behind a #ifdef CONCRETELANG_CUDA_SUPPORT).
In compilers/concrete-compiler/compiler/lib/Runtime/wrappers.cpp
, the cuda backend server functions are added (behind a #ifdef CONCRETELANG_CUDA_SUPPORT
)
The pass ConcreteToCAPI
is modified to have a flag to insert calls to these new wrappers instead of the cpu ones (the code calling this pass is modified accordingly).
It may be possible to replace the cpu wrappers (with a compilation flag) instead of adding new ones to avoid having to change the pass.
In compilers/concrete-compiler/CMakeLists.txt, a section #Concrete Cuda Configuration has been added. Other CMakeLists.txt files have also been modified (or added) with if(CONCRETELANG_CUDA_SUPPORT) guards to handle header includes, linking, and so on.
High Level Fully Homomorphic Encryption dialect: a dialect for the representation of high-level operations on fully homomorphic ciphertexts.
TFHE.batched_add_glwe_cst_int
(::mlir::concretelang::TFHE::ABatchedAddGLWECstIntOp)Batched version of AddGLWEIntOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertext
A GLWE ciphertext
plaintexts
1D tensor of integer values
result
1D tensor of A GLWE ciphertext values
TFHE.batched_add_glwe_int_cst
(::mlir::concretelang::TFHE::ABatchedAddGLWEIntCstOp)Batched version of AddGLWEIntOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertexts
1D tensor of A GLWE ciphertext values
plaintext
integer
result
1D tensor of A GLWE ciphertext values
TFHE.batched_add_glwe_int
(::mlir::concretelang::TFHE::ABatchedAddGLWEIntOp)Batched version of AddGLWEIntOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertexts
1D tensor of A GLWE ciphertext values
plaintexts
1D tensor of integer values
result
1D tensor of A GLWE ciphertext values
TFHE.batched_add_glwe
(::mlir::concretelang::TFHE::ABatchedAddGLWEOp)Batched version of AddGLWEOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertexts_a
1D tensor of A GLWE ciphertext values
ciphertexts_b
1D tensor of A GLWE ciphertext values
result
1D tensor of A GLWE ciphertext values
TFHE.add_glwe_int
(::mlir::concretelang::TFHE::AddGLWEIntOp)Returns the sum of a clear integer and an lwe ciphertext
Traits: AlwaysSpeculatableImplTrait
Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
a
A GLWE ciphertext
b
integer
«unnamed»
A GLWE ciphertext
TFHE.add_glwe
(::mlir::concretelang::TFHE::AddGLWEOp)Returns the sum of two lwe ciphertexts
Traits: AlwaysSpeculatableImplTrait
Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
a
A GLWE ciphertext
b
A GLWE ciphertext
«unnamed»
A GLWE ciphertext
TFHE.batched_bootstrap_glwe
(::mlir::concretelang::TFHE::BatchedBootstrapGLWEOp)Batched version of BootstrapGLWEOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
key
::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr
An attribute representing bootstrap key.
ciphertexts
1D tensor of A GLWE ciphertext values
lookup_table
1D tensor of 64-bit signless integer values
result
1D tensor of A GLWE ciphertext values
TFHE.batched_keyswitch_glwe
(::mlir::concretelang::TFHE::BatchedKeySwitchGLWEOp)Batched version of KeySwitchGLWEOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
key
::mlir::concretelang::TFHE::GLWEKeyswitchKeyAttr
An attribute representing keyswitch key.
ciphertexts
1D tensor of A GLWE ciphertext values
result
1D tensor of A GLWE ciphertext values
TFHE.batched_mapped_bootstrap_glwe
(::mlir::concretelang::TFHE::BatchedMappedBootstrapGLWEOp)Batched version of BootstrapGLWEOp which also batches the lookup table
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
key
::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr
An attribute representing bootstrap key.
ciphertexts
1D tensor of A GLWE ciphertext values
lookup_table
2D tensor of 64-bit signless integer values
result
1D tensor of A GLWE ciphertext values
TFHE.batched_mul_glwe_cst_int
(::mlir::concretelang::TFHE::BatchedMulGLWECstIntOp)Batched version of MulGLWECstIntOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertext
A GLWE ciphertext
cleartexts
1D tensor of integer values
result
1D tensor of A GLWE ciphertext values
TFHE.batched_mul_glwe_int_cst
(::mlir::concretelang::TFHE::BatchedMulGLWEIntCstOp)Batched version of MulGLWEIntCstOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertexts
1D tensor of A GLWE ciphertext values
cleartext
integer
result
1D tensor of A GLWE ciphertext values
TFHE.batched_mul_glwe_int
(::mlir::concretelang::TFHE::BatchedMulGLWEIntOp)Batched version of MulGLWEIntOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertexts
1D tensor of A GLWE ciphertext values
cleartexts
1D tensor of integer values
result
1D tensor of A GLWE ciphertext values
TFHE.batched_neg_glwe
(::mlir::concretelang::TFHE::BatchedNegGLWEOp)Batched version of NegGLWEOp
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ciphertexts
1D tensor of A GLWE ciphertext values
result
1D tensor of A GLWE ciphertext values
TFHE.bootstrap_glwe
(::mlir::concretelang::TFHE::BootstrapGLWEOp)Programmable bootstrapping of a GLWE ciphertext with a lookup table
Traits: AlwaysSpeculatableImplTrait
Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
key
::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr
An attribute representing bootstrap key.
ciphertext
A GLWE ciphertext
lookup_table
1D tensor of 64-bit signless integer values
result
A GLWE ciphertext
TFHE.encode_expand_lut_for_bootstrap
(::mlir::concretelang::TFHE::EncodeExpandLutForBootstrapOp)Encode and expand a lookup table so that it can be used for a bootstrap.
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
polySize
::mlir::IntegerAttr
32-bit signless integer attribute
outputBits
::mlir::IntegerAttr
32-bit signless integer attribute
isSigned
::mlir::BoolAttr
bool attribute
input_lookup_table
1D tensor of 64-bit signless integer values
result
1D tensor of 64-bit signless integer values
TFHE.encode_lut_for_crt_woppbs
(::mlir::concretelang::TFHE::EncodeLutForCrtWopPBSOp)Encode and expand a lookup table so that it can be used for a wop pbs.
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
crtDecomposition
::mlir::ArrayAttr
64-bit integer array attribute
crtBits
::mlir::ArrayAttr
64-bit integer array attribute
modulusProduct
::mlir::IntegerAttr
32-bit signless integer attribute
isSigned
::mlir::BoolAttr
bool attribute
input_lookup_table
1D tensor of 64-bit signless integer values
result
2D tensor of 64-bit signless integer values
TFHE.encode_plaintext_with_crt
(::mlir::concretelang::TFHE::EncodePlaintextWithCrtOp)Encodes a plaintext by decomposing it on a crt basis.
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
mods
::mlir::ArrayAttr
64-bit integer array attribute
modsProd
::mlir::IntegerAttr
64-bit signless integer attribute
input
64-bit signless integer
result
1D tensor of 64-bit signless integer values
TFHE.keyswitch_glwe
(::mlir::concretelang::TFHE::KeySwitchGLWEOp)Change the encryption parameters of a glwe ciphertext by applying a keyswitch
Traits: AlwaysSpeculatableImplTrait
Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
key
::mlir::concretelang::TFHE::GLWEKeyswitchKeyAttr
An attribute representing keyswitch key.
ciphertext
A GLWE ciphertext
result
A GLWE ciphertext
TFHE.mul_glwe_int
(::mlir::concretelang::TFHE::MulGLWEIntOp)Returns the product of a clear integer and an lwe ciphertext
Traits: AlwaysSpeculatableImplTrait
Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
a
A GLWE ciphertext
b
integer
«unnamed»
A GLWE ciphertext
TFHE.neg_glwe
(::mlir::concretelang::TFHE::NegGLWEOp)Negates a glwe ciphertext
Traits: AlwaysSpeculatableImplTrait
Interfaces: BatchableOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
a
A GLWE ciphertext
«unnamed»
A GLWE ciphertext
TFHE.sub_int_glwe
(::mlir::concretelang::TFHE::SubGLWEIntOp)Subtracts a GLWE ciphertext from a clear integer
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
a
integer
b
A GLWE ciphertext
«unnamed»
A GLWE ciphertext
TFHE.wop_pbs_glwe
(::mlir::concretelang::TFHE::WopPBSGLWEOp)
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
ksk
::mlir::concretelang::TFHE::GLWEKeyswitchKeyAttr
An attribute representing keyswitch key.
bsk
::mlir::concretelang::TFHE::GLWEBootstrapKeyAttr
An attribute representing bootstrap key.
pksk
::mlir::concretelang::TFHE::GLWEPackingKeyswitchKeyAttr
An attribute representing Wop Pbs key.
crtDecomposition
::mlir::ArrayAttr
64-bit integer array attribute
cbsLevels
::mlir::IntegerAttr
32-bit signless integer attribute
cbsBaseLog
::mlir::IntegerAttr
32-bit signless integer attribute
ciphertexts
lookupTable
2D tensor of 64-bit signless integer values
result
TFHE.zero
(::mlir::concretelang::TFHE::ZeroGLWEOp)Returns a trivial encryption of 0
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
out
A GLWE ciphertext
TFHE.zero_tensor
(::mlir::concretelang::TFHE::ZeroTensorGLWEOp)Returns a tensor containing trivial encryptions of 0
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
tensor
An attribute representing bootstrap key.
Syntax:
inputKey
mlir::concretelang::TFHE::GLWESecretKey
outputKey
mlir::concretelang::TFHE::GLWESecretKey
polySize
int
glweDim
int
levels
int
baseLog
int
index
int
An attribute representing keyswitch key.
Syntax:
inputKey
mlir::concretelang::TFHE::GLWESecretKey
outputKey
mlir::concretelang::TFHE::GLWESecretKey
levels
int
baseLog
int
index
int
An attribute representing Wop Pbs key.
Syntax:
inputKey
mlir::concretelang::TFHE::GLWESecretKey
outputKey
mlir::concretelang::TFHE::GLWESecretKey
outputPolySize
int
innerLweDim
int
glweDim
int
levels
int
baseLog
int
index
int
A GLWE ciphertext
A GLWE ciphertext.
key
mlir::concretelang::TFHE::GLWESecretKey
High Level Fully Homomorphic Encryption dialect. A dialect for the representation of high-level operations on fully homomorphic ciphertexts.
FHE.add_eint_int
(::mlir::concretelang::FHE::AddEintIntOp)Adds an encrypted integer and a clear integer
The clear integer must have at most one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.
Example:
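A minimal illustrative snippet (not taken from the original documentation; names and bit widths are assumptions), with a 4-bit encrypted operand and a 5-bit clear operand:

```mlir
// Adds a clear i5 value to an encrypted 4-bit integer; the result keeps
// the width and signedness of the encrypted operand.
%res = "FHE.add_eint_int"(%enc, %clear) : (!FHE.eint<4>, i5) -> !FHE.eint<4>
```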
Traits: AlwaysSpeculatableImplTrait
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.add_eint
(::mlir::concretelang::FHE::AddEintOp)Adds two encrypted integers
The encrypted integers and the result must have the same width and the same signedness.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: AdditiveNoise, BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.apply_lookup_table
(::mlir::concretelang::FHE::ApplyLookupTableEintOp)Applies a clear lookup table to an encrypted integer
The width of the result can be different than the width of the operand. The lookup table must be a tensor of size 2^p
where p
is the width of the encrypted integer.
Example:
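A minimal illustrative sketch (names and widths are assumptions): a 2-bit encrypted index selects one of 2^2 = 4 clear table entries, producing a 3-bit encrypted result.

```mlir
// %lut holds the 4 clear table entries; the output width may differ
// from the input width.
%res = "FHE.apply_lookup_table"(%x, %lut) : (!FHE.eint<2>, tensor<4xi64>) -> !FHE.eint<3>
```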
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.and
(::mlir::concretelang::FHE::BoolAndOp)Applies an AND gate to two encrypted boolean values
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.nand
(::mlir::concretelang::FHE::BoolNandOp)Applies a NAND gate to two encrypted boolean values
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.not
(::mlir::concretelang::FHE::BoolNotOp)Applies a NOT gate to an encrypted boolean value
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.or
(::mlir::concretelang::FHE::BoolOrOp)Applies an OR gate to two encrypted boolean values
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.xor
(::mlir::concretelang::FHE::BoolXorOp)Applies an XOR gate to two encrypted boolean values
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.change_partition
(::mlir::concretelang::FHE::ChangePartitionEintOp)Change partition if necessary.
Changes the partition of a ciphertext. If necessary, it keyswitches the ciphertext to a different key with a different set of parameters than the original one.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.from_bool
(::mlir::concretelang::FHE::FromBoolOp)Cast a boolean to an unsigned integer
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.gen_gate
(::mlir::concretelang::FHE::GenGateOp)Applies a truth table based on two boolean inputs
Truth table must be a tensor of four boolean values.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.lsb
(::mlir::concretelang::FHE::LsbEintOp)Extract the least significant bit at a given precision.
This operation extracts the lsb of a ciphertext in a specific precision.
Extracting the lsb with the smallest precision:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.max_eint
(::mlir::concretelang::FHE::MaxEintOp)Retrieve the maximum of two encrypted integers.
Retrieve the maximum of two encrypted integers using the formula `max(x, y) == max(x - y, 0) + y`. The input and output types should be the same.
If `x - y` inside the max overflows or underflows, the behavior is undefined. To support the full range, you should increase the bit width by 1 manually.
Example:
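An illustrative sketch (widths are assumptions), showing that both inputs and the result share the same type:

```mlir
// max(x, y) computed homomorphically; undefined if x - y over/underflows.
%m = "FHE.max_eint"(%x, %y) : (!FHE.eint<4>, !FHE.eint<4>) -> !FHE.eint<4>
```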
Traits: AlwaysSpeculatableImplTrait
Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.mul_eint_int
(::mlir::concretelang::FHE::MulEintIntOp)Multiply an encrypted integer with a clear integer
The clear integer must have one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.mul_eint
(::mlir::concretelang::FHE::MulEintOp)Multiplies two encrypted integers
The encrypted integers and the result must have the same width and signedness. Also, due to the current implementation, one supplementary bit of width must be provided, in addition to the number of bits needed to encode the largest output value.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.mux
(::mlir::concretelang::FHE::MuxOp)Multiplexer for two encrypted boolean inputs, based on an encrypted condition
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.neg_eint
(::mlir::concretelang::FHE::NegEintOp)Negates an encrypted integer
The result must have the same width and the same signedness as the encrypted integer.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.reinterpret_precision
(::mlir::concretelang::FHE::ReinterpretPrecisionEintOp)Reinterpret the ciphertext with a different precision.
Changes the precision of a ciphertext. It changes the precision, the value, and in certain cases the correctness of the ciphertext.
Changing to:
a bigger precision is always safe. This is equivalent to a shift left for the value.
a smaller precision is only safe if you clear the lowest bits that are discarded. If not, you can assume small errors on the next TLU. This is equivalent to a shift right for the value.
Example:
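An illustrative sketch (widths are assumptions): reinterpreting a 4-bit ciphertext as a 6-bit one, the always-safe direction, which corresponds to a shift left of the value.

```mlir
%y = "FHE.reinterpret_precision"(%x) : (!FHE.eint<4>) -> !FHE.eint<6>
```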
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.round
(::mlir::concretelang::FHE::RoundEintOp)Rounds a ciphertext to a smaller precision.
Assuming a ciphertext whose message is implemented over p bits, this operation rounds it to fit into q bits, with p > q.
Example:
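An illustrative sketch (widths are assumptions): rounding a 6-bit ciphertext down to 4 bits.

```mlir
// p = 6 bits rounded to q = 4 bits (p > q).
%r = "FHE.round"(%x) : (!FHE.eint<6>) -> !FHE.eint<4>
```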
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.sub_eint_int
(::mlir::concretelang::FHE::SubEintIntOp)Subtract a clear integer from an encrypted integer
The clear integer must have one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.sub_eint
(::mlir::concretelang::FHE::SubEintOp)Subtract an encrypted integer from an encrypted integer
The encrypted integers and the result must have the same width and the same signedness.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: AdditiveNoise, BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.sub_int_eint
(::mlir::concretelang::FHE::SubIntEintOp)Subtract an encrypted integer from a clear integer
The clear integer must have one more bit than the encrypted integer and the result must have the same width and the same signedness as the encrypted integer.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: Binary, BinaryIntEint, ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHE.to_bool
(::mlir::concretelang::FHE::ToBoolOp)Cast an unsigned integer to a boolean
The input must be of width one or two, two being the current representation of an encrypted boolean, with one bit left for the carry.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.to_signed
(::mlir::concretelang::FHE::ToSignedOp)Cast an unsigned integer to a signed one
The result must have the same width as the input.
The behavior is undefined on overflow/underflow.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.to_unsigned
(::mlir::concretelang::FHE::ToUnsignedOp)Cast a signed integer to an unsigned one
The result must have the same width as the input.
The behavior is undefined on overflow/underflow.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHE.zero
(::mlir::concretelang::FHE::ZeroEintOp)Returns a trivial encrypted integer of 0
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), ZeroNoise
Effects: MemoryEffects::Effect{}
FHE.zero_tensor
(::mlir::concretelang::FHE::ZeroTensorOp)Creates a new tensor with all elements initialized to an encrypted zero.
Creates a new tensor with the shape specified in the result type and initializes its elements with an encrypted zero.
Example:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), ZeroNoise
Effects: MemoryEffects::Effect{}
An attribute representing a partition.
Syntax:
An encrypted boolean
Syntax: !FHE.ebool
An encrypted boolean.
An encrypted signed integer
An encrypted signed integer with width bits, used to perform FHE operations.
Examples:
An encrypted unsigned integer
An encrypted unsigned integer with width bits, used to perform FHE operations.
Examples:
High Level Fully Homomorphic Encryption Linalg dialect. A dialect for the representation of high-level linalg operations on fully homomorphic ciphertexts.
FHELinalg.add_eint_int
(::mlir::concretelang::FHELinalg::AddEintIntOp)Returns a tensor that contains the addition of a tensor of encrypted integers and a tensor of clear integers.
Performs an addition following the broadcasting rules between a tensor of encrypted integers and a tensor of clear integers. The width of the clear integers must be less than or equal to the width of encrypted integers.
Examples:
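A minimal illustrative sketch (shapes and widths are assumptions): a clear tensor of shape <3> is broadcast against an encrypted tensor of shape <2x3>.

```mlir
%res = "FHELinalg.add_eint_int"(%lhs, %rhs)
         : (tensor<2x3x!FHE.eint<4>>, tensor<3xi4>) -> tensor<2x3x!FHE.eint<4>>
```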
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt, TensorBroadcastingRules
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.add_eint
(::mlir::concretelang::FHELinalg::AddEintOp)Returns a tensor that contains the addition of two tensors of encrypted integers.
Performs an addition following the broadcasting rules between two tensors of encrypted integers. The width of the encrypted integers must be equal.
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint, TensorBroadcastingRules
Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.apply_lookup_table
(::mlir::concretelang::FHELinalg::ApplyLookupTableEintOp)Returns a tensor that contains the result of the lookup on a table.
For each encrypted index, performs a lookup table of clear integers.
The %lut argument must be a one-dimensional tensor of size 2^p, where p is the width of the encrypted integers.
Examples:
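An illustrative sketch (shapes are assumptions): every 2-bit element of the encrypted tensor indexes the same 4-entry clear table.

```mlir
%res = "FHELinalg.apply_lookup_table"(%t, %lut)
         : (tensor<2x3x!FHE.eint<2>>, tensor<4xi64>) -> tensor<2x3x!FHE.eint<3>>
```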
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.apply_mapped_lookup_table
(::mlir::concretelang::FHELinalg::ApplyMappedLookupTableEintOp)Returns a tensor that contains the result of the lookup on a table, using a different lookup table for each element, specified by a map.
Performs for each encrypted index a lookup table of clear integers. Multiple lookup tables are passed, and the application of lookup tables is performed following the broadcasting rules. The precise lookup is specified by a map.
Examples:
Other examples: // [0,1] [1, 0] = [3,2] // [3,0] lut [[1,3,5,7], [0,2,4,6]] with [0, 1] = [7,0] // [2,3] [1, 0] = [4,7]
// [0,1] [0, 0] = [1,3] // [3,0] lut [[1,3,5,7], [0,2,4,6]] with [1, 1] = [6,0] // [2,3] [1, 0] = [4,7]
// [0,1] [0] = [1,3] // [3,0] lut [[1,3,5,7], [0,2,4,6]] with [1] = [6,0] // [2,3] [0] = [5,7]
// [0,1] = [1,2] // [3,0] lut [[1,3,5,7], [0,2,4,6]] with [0, 1] = [7,0] // [2,3] = [5,6]
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.apply_multi_lookup_table
(::mlir::concretelang::FHELinalg::ApplyMultiLookupTableEintOp)Returns a tensor that contains the result of the lookup on a table, using a different lookup table for each element.
Performs for each encrypted index a lookup table of clear integers. Multiple lookup tables are passed, and the application of lookup tables is performed following the broadcasting rules.
The %luts argument should be a tensor with M dimensions, where the first M-1 dimensions are broadcastable with the N dimensions of the encrypted tensor, and where the last dimension is equal to 2^p, where p is the width of the encrypted integers.
Examples:
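An illustrative sketch (shapes are assumptions): one 4-entry table per element of a 2x3 tensor of 2-bit ciphertexts, so the last dimension of %luts is 2^2 = 4.

```mlir
%res = "FHELinalg.apply_multi_lookup_table"(%t, %luts)
         : (tensor<2x3x!FHE.eint<2>>, tensor<2x3x4xi64>) -> tensor<2x3x!FHE.eint<2>>
```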
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.broadcast
(::mlir::concretelang::FHELinalg::BroadcastOp)Broadcasts a tensor to a shape.
Broadcasting is used for expanding certain dimensions of a tensor or adding new dimensions to it at the beginning.
An example could be broadcasting a tensor with shape <1x2x1x4x1> to a tensor of shape <6x1x2x3x4x5>.
In this example:
last dimension of the input (1) is expanded to (5)
the dimension before that (4) is kept
the dimension before that (1) is expanded to (3)
the dimension before that (2) is kept
the dimension before that (1) is kept
a new dimension (6) is added to the beginning
See https://numpy.org/doc/stable/user/basics.broadcasting.html#general-broadcasting-rules for the semantics of broadcasting.
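A sketch of the broadcast described above (the element type is an assumption):

```mlir
%b = "FHELinalg.broadcast"(%a)
       : (tensor<1x2x1x4x1x!FHE.eint<4>>) -> tensor<6x1x2x3x4x5x!FHE.eint<4>>
```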
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, ConstantNoise, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.concat
(::mlir::concretelang::FHELinalg::ConcatOp)Concatenates a sequence of tensors along an existing axis.
Concatenates several tensors along a given axis.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.conv2d
(::mlir::concretelang::FHELinalg::Conv2dOp)Returns the 2D convolution of a tensor in the form NCHW with weights in the form FCHW
Traits: AlwaysSpeculatableImplTrait
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.dot_eint_int
(::mlir::concretelang::FHELinalg::Dot)Returns the encrypted dot product between a vector of encrypted integers and a vector of clear integers.
Performs a dot product between a vector of encrypted integers and a vector of clear integers.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.dot_eint_eint
(::mlir::concretelang::FHELinalg::DotEint)Returns the encrypted dot product between two vectors of encrypted integers.
Performs a dot product between two vectors of encrypted integers.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.fancy_assign
(::mlir::concretelang::FHELinalg::FancyAssignOp)Assigns a tensor into another tensor at a tensor of indices.
Examples:
Notes:
Assigning to the same output position results in undefined behavior.
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.fancy_index
(::mlir::concretelang::FHELinalg::FancyIndexOp)Index into a tensor using a tensor of indices.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.from_element
(::mlir::concretelang::FHELinalg::FromElementOp)Creates a tensor with a single element.
Creates a tensor with a single element.
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.lsb
(::mlir::concretelang::FHELinalg::LsbEintOp)Extract the least significant bit at a given precision.
This operation extracts the lsb of a ciphertext tensor in a specific precision.
Extracting only 1 bit:
Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint
Interfaces: ConditionallySpeculatable, ConstantNoise, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHELinalg.matmul_eint_eint
(::mlir::concretelang::FHELinalg::MatMulEintEintOp)Returns a tensor that contains the result of the matrix multiplication of a matrix of encrypted integers and a second matrix of encrypted integers.
Performs a matrix multiplication of a matrix of encrypted integers and a second matrix of encrypted integers.
The behavior depends on the arguments in the following way:
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.matmul_eint_int
(::mlir::concretelang::FHELinalg::MatMulEintIntOp)Returns a tensor that contains the result of the matrix multiplication of a matrix of encrypted integers and a matrix of clear integers.
Performs a matrix multiplication of a matrix of encrypted integers and a matrix of clear integers. The width of the clear integers must be less than or equal to the width of encrypted integers.
The behavior depends on the arguments in the following way:
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.matmul_int_eint
(::mlir::concretelang::FHELinalg::MatMulIntEintOp)Returns a tensor that contains the result of the matrix multiplication of a matrix of clear integers and a matrix of encrypted integers.
Performs a matrix multiplication of a matrix of clear integers and a matrix of encrypted integers. The width of the clear integers must be less than or equal to the width of encrypted integers.
The behavior depends on the arguments in the following way:
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryIntEint
Interfaces: Binary, BinaryIntEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.maxpool2d
(::mlir::concretelang::FHELinalg::Maxpool2dOp)Returns the 2D maxpool of a tensor in the form NCHW
Interfaces: UnaryEint
FHELinalg.mul_eint_int
(::mlir::concretelang::FHELinalg::MulEintIntOp)Returns a tensor that contains the multiplication of a tensor of encrypted integers and a tensor of clear integers.
Performs a multiplication following the broadcasting rules between a tensor of encrypted integers and a tensor of clear integers. The width of the clear integers must be less than or equal to the width of encrypted integers.
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt, TensorBroadcastingRules
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.mul_eint
(::mlir::concretelang::FHELinalg::MulEintOp)Returns a tensor that contains the multiplication of two tensors of encrypted integers.
Performs a multiplication following the broadcasting rules between two tensors of encrypted integers. The width of the encrypted integers must be equal.
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint, TensorBroadcastingRules
Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.neg_eint
(::mlir::concretelang::FHELinalg::NegEintOp)Returns a tensor that contains the negation of a tensor of encrypted integers.
Performs a negation to a tensor of encrypted integers.
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint
Interfaces: ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHELinalg.reinterpret_precision
(::mlir::concretelang::FHELinalg::ReinterpretPrecisionEintOp)Reinterpret the ciphertext tensor with a different precision.
It's a reinterpretation cast which changes only the precision. On the CRT representation, it does nothing. On the native representation, it moves the message/noise further forward, effectively changing the precision. Changing to:
a bigger precision is safe, as the crypto-parameters are chosen such that only zeros will come from the noise part. This is equivalent to a shift left for the value.
a smaller precision is only safe if you clear the lowest message bits first. If not, you can assume small errors with high probability and frequent bigger errors, which can be contained to small errors using margins. This is equivalent to a shift right for the value.
Example:
Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHELinalg.round
(::mlir::concretelang::FHELinalg::RoundOp)Rounds a tensor of ciphertexts into a smaller precision.
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEintInt, TensorBroadcastingRules
Interfaces: Binary, BinaryEintInt, ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.sub_eint
(::mlir::concretelang::FHELinalg::SubEintOp)Returns a tensor that contains the subtraction of two tensors of encrypted integers.
Performs a subtraction following the broadcasting rules between two tensors of encrypted integers. The width of the encrypted integers must be equal.
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryEint, TensorBroadcastingRules
Interfaces: BinaryEint, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.sub_int_eint
(::mlir::concretelang::FHELinalg::SubIntEintOp)Returns a tensor that contains the subtraction of a tensor of clear integers and a tensor of encrypted integers.
Performs a subtraction following the broadcasting rules between a tensor of clear integers and a tensor of encrypted integers. The width of the clear integers must be less than or equal to the width of encrypted integers.
Examples:
Traits: AlwaysSpeculatableImplTrait, TensorBinaryIntEint, TensorBroadcastingRules
Interfaces: Binary, BinaryIntEint, ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.sum
(::mlir::concretelang::FHELinalg::SumOp)Returns the sum of elements of a tensor of encrypted integers along specified axes.
Attributes:
keep_dims: boolean = false. Whether to keep the rank of the tensor after the sum operation; if true, reduced axes will have size 1.
axes: I64ArrayAttr = []. List of dimensions to perform the sum along; think of them as the dimensions to reduce (see the examples below for an intuition).
Examples:
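An illustrative sketch (shapes and widths are assumptions): summing a 2x3 tensor along axis 1 without keeping dimensions.

```mlir
%s = "FHELinalg.sum"(%t) {axes = [1], keep_dims = false}
       : (tensor<2x3x!FHE.eint<7>>) -> tensor<2x!FHE.eint<7>>
```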
Traits: AlwaysSpeculatableImplTrait, TensorUnaryEint
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
FHELinalg.to_signed
(::mlir::concretelang::FHELinalg::ToSignedOp)Cast an unsigned integer tensor to a signed one
Cast an unsigned integer tensor to a signed one. The result must have the same width and the same shape as the input.
The behavior is undefined on overflow/underflow.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHELinalg.to_unsigned
(::mlir::concretelang::FHELinalg::ToUnsignedOp)Cast a signed integer tensor to an unsigned one
Cast a signed integer tensor to an unsigned one. The result must have the same width and the same shape as the input.
The behavior is undefined on overflow/underflow.
Examples:
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
FHELinalg.transpose
(::mlir::concretelang::FHELinalg::TransposeOp)Returns a tensor that contains the transposition of the input tensor.
Performs a transpose operation on an N-dimensional tensor.
Attributes:
axes: I64ArrayAttr = []. List of dimensions describing the transposition; it contains a permutation of [0,1,..,N-1], where N is the number of axes. Think of it as a way to rearrange axes (see the example below).
Examples:
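An illustrative sketch (shapes are assumptions): swapping the two axes of a 2x3 tensor.

```mlir
%t2 = "FHELinalg.transpose"(%t) {axes = [1, 0]}
        : (tensor<2x3x!FHE.eint<4>>) -> tensor<3x2x!FHE.eint<4>>
```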
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, MaxNoise, NoMemoryEffect (MemoryEffectOpInterface), UnaryEint
Effects: MemoryEffects::Effect{}
Compilation of a Python program starts with Concrete's Python frontend, which first traces and transforms it and then converts it into an intermediate representation (IR) that is further processed by Concrete Compiler. This IR is based on the MLIR subproject of the LLVM compiler infrastructure. This document provides an overview of Concrete's FHE-specific representations based on the MLIR framework.
In contrast to traditional infrastructure for compilers, the set of operations and data types that constitute the IR, as well as the level of abstraction that the IR represents, are not fixed in MLIR and can easily be extended. All operations and data types are grouped into dialects, with each dialect representing a specific domain or a specific level of abstraction. Mixing operations and types from different dialects within the same IR is allowed and even encouraged, with all dialects--builtin or developed as an extension--being first-class citizens.
Concrete compiler takes advantage of these concepts by defining a set of dialects, capable of representing an FHE program from an abstract specification that is independent of the actual cryptosystem down to a program that can easily be mapped to function calls of a cryptographic library. The dialects for the representation of an FHE program are:
The FHELinalg Dialect (documentation, source)
The FHE Dialect (documentation, source)
The TFHE Dialect (documentation, source)
The Concrete Dialect (documentation, source)
and for debugging purposes, the Tracing Dialect (documentation, source).
In addition, the project further defines two dialects that help expose dynamic task-parallelism and static data-flow graphs in order to benefit from multi-core, multi-accelerator and distributed systems. These are:
The RT Dialect (documentation, source) and
The SDFG Dialect (documentation, source).
The figure below illustrates the relationship between the dialects and their embedding into the compilation pipeline.
The following sections focus on the FHE-related dialects, i.e., on the FHELinalg Dialect, the FHE Dialect, the TFHE Dialect and the Concrete Dialect.
The top part of the figure shows the components which are involved in the generation of the initial IR, ending with the step labelled MLIR translation. When the initial IR is passed on to Concrete Compiler through its Python bindings, all FHE-related operations are specified using either the FHE or FHELinalg Dialect. Both of these dialects provide operations and data types for the abstract specification of an FHE program, completely independently of a cryptosystem. At this point, the IR simply indicates whether an operand is encrypted (via the type FHE.eint<n>, where n stands for the precision in bits) and what operations are applied to encrypted values. Plaintext values simply use MLIR's builtin integer types (e.g., i3 or i64).
The FHE Dialect provides scalar operations on encrypted integers, such as additions (FHE.add_eint) or multiplications (FHE.mul_eint), while the FHELinalg Dialect offers operations on tensors of encrypted integers, e.g., matrix products (FHELinalg.matmul_eint_eint) or convolutions (FHELinalg.conv2d).
In a first lowering step of the pipeline, all FHELinalg operations are lowered to operations from MLIR's builtin Linalg Dialect using scalar operations from the FHE Dialect. Consider the following example, which consists of a function that performs a multiplication of a matrix of encrypted integers and a matrix of cleartext values:
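The original code listing is not reproduced here; the following is a minimal sketch of what such a function could look like at the FHELinalg level (shapes and bit widths are illustrative assumptions):

```mlir
func.func @main(%arg0: tensor<4x3x!FHE.eint<2>>,
                %arg1: tensor<3x2xi3>) -> tensor<4x2x!FHE.eint<2>> {
  // Matrix product of a 4x3 encrypted matrix and a 3x2 clear matrix.
  %0 = "FHELinalg.matmul_eint_int"(%arg0, %arg1)
         : (tensor<4x3x!FHE.eint<2>>, tensor<3x2xi3>) -> tensor<4x2x!FHE.eint<2>>
  return %0 : tensor<4x2x!FHE.eint<2>>
}
```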
Upon conversion, the FHELinalg.matmul operation is converted to a linalg.generic operation whose body contains a scalar multiplication (FHE.mul_eint_int) and a scalar addition (FHE.add_eint):
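A simplified sketch of the lowered form (the exact indexing maps, attribute syntax and op spellings depend on the compiler and MLIR version):

```mlir
#lhs = affine_map<(m, n, k) -> (m, k)>
#rhs = affine_map<(m, n, k) -> (k, n)>
#out = affine_map<(m, n, k) -> (m, n)>
func.func @main(%arg0: tensor<4x3x!FHE.eint<2>>,
                %arg1: tensor<3x2xi3>) -> tensor<4x2x!FHE.eint<2>> {
  // Accumulator initialized with trivial encryptions of zero.
  %acc0 = "FHE.zero_tensor"() : () -> tensor<4x2x!FHE.eint<2>>
  %0 = linalg.generic
         {indexing_maps = [#lhs, #rhs, #out],
          iterator_types = ["parallel", "parallel", "reduction"]}
         ins(%arg0, %arg1 : tensor<4x3x!FHE.eint<2>>, tensor<3x2xi3>)
         outs(%acc0 : tensor<4x2x!FHE.eint<2>>) {
  ^bb0(%a: !FHE.eint<2>, %b: i3, %acc: !FHE.eint<2>):
    // Encrypted-times-clear product, accumulated into the encrypted sum.
    %p = "FHE.mul_eint_int"(%a, %b) : (!FHE.eint<2>, i3) -> !FHE.eint<2>
    %s = "FHE.add_eint"(%acc, %p) : (!FHE.eint<2>, !FHE.eint<2>) -> !FHE.eint<2>
    linalg.yield %s : !FHE.eint<2>
  } -> tensor<4x2x!FHE.eint<2>>
  return %0 : tensor<4x2x!FHE.eint<2>>
}
```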
This is then further lowered to a nest of loops from MLIR's SCF Dialect, implementing the parallel and reduction dimensions from the linalg.generic operation above:
In order to obtain an executable program at the end of the compilation pipeline, the abstract specification of the FHE program must at some point be bound to a specific cryptosystem. This is the role of the TFHE Dialect, whose purpose is:
to indicate operations to be carried out using an implementation of the TFHE cryptosystem
to parametrize the cryptosystem with key sizes, and
to provide a mapping between keys and encrypted values
When lowering the IR based on the FHE Dialect to the TFHE Dialect, the compiler first generates a generic form, in which FHE operations are lowered to TFHE operations and where values are converted to unparametrized TFHE.glwe values. The unparametrized form TFHE.glwe<sk?> simply indicates that a TFHE.glwe value is to be used, but without any indication of the cryptographic parameters and the actual key.
The IR below shows the example program after lowering to unparametrized TFHE:
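The original listing is not reproduced here; as a purely illustrative fragment, a single scalar multiplication might look as follows after this step:

```mlir
// The ciphertext type carries no key or parameter information yet.
%p = "TFHE.mul_glwe_int"(%a, %b) : (!TFHE.glwe<sk?>, i3) -> !TFHE.glwe<sk?>
```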
All operations from the FHE dialect have been replaced with corresponding operations from the TFHE Dialect.
During subsequent parametrization, the compiler can either use a set of default parameters or obtain a set of parameters from Concrete's optimizer. Either way, an additional pass injects the parameters into the IR, replacing all TFHE.glwe<sk?> instances with TFHE.glwe<i,d,n>, where i is a sequential identifier for a key, d the number of GLWE dimensions and n the size of the GLWE polynomial.
The result of such a parametrization for the example is given below:
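As a fragment following the TFHE.glwe<i,d,n> notation described above (the exact printed syntax may differ between compiler versions), the same multiplication could now read:

```mlir
// Key 0, one GLWE dimension, polynomial size 512.
%p = "TFHE.mul_glwe_int"(%a, %b) : (!TFHE.glwe<0,1,512>, i3) -> !TFHE.glwe<0,1,512>
```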
In this parametrization, a single key with the ID 0 is used, with a single dimension and a polynomial of size 512.
In the next step of the pipeline, operations and types are lowered to the Concrete Dialect. This dialect provides operations which are implemented by one of Concrete's backend libraries, but still abstracts from any technical details required for interaction with an actual library. The goal is to maintain a high-level representation with value-based semantics and actual operations instead of buffer semantics and library calls, while ensuring that all operations can effectively be lowered to a library call later in the pipeline. However, the abstract types from TFHE are already lowered to tensors of integers with a suitable shape that will hold the binary data of the encrypted values.
The result of the lowering of the example to the Concrete Dialect is shown below:
The remaining stages of the pipeline are rather technical. Before any binding to an actual Concrete backend library, the compiler first invokes MLIR's bufferization infrastructure to convert the value-based IR into an IR with buffer semantics. In particular, this means that keys and encrypted values are no longer abstract values in a mathematical sense, but values backed by a memory location that holds the actual data. This form of IR is then suitable for a pass emitting actual library calls that implement the corresponding operations from the Concrete Dialect for a specific backend.
The result for the example is given below:
At this stage, the IR is only composed of operations from builtin Dialects and thus amenable to lowering to LLVM-IR using the lowering passes provided by MLIR.
Runtime dialect. A dialect for representing the abstractions needed for the runtime.
RT.await_future
(::mlir::concretelang::RT::AwaitFutureOp)Wait for a future and access its data.
The results of a dataflow task are always futures which could be further used as inputs to subsequent tasks. When the result of a task is needed in the outer execution context, the result future needs to be synchronized and its data accessed using RT.await_future.
RT.build_return_ptr_placeholder
(::mlir::concretelang::RT::BuildReturnPtrPlaceholderOp)
RT.clone_future
(::mlir::concretelang::RT::CloneFutureOp)
Interfaces: AllocationOpInterface, MemoryEffectOpInterface
RT.create_async_task
(::mlir::concretelang::RT::CreateAsyncTaskOp)Create a dataflow task.
RT.dataflow_task
(::mlir::concretelang::RT::DataflowTaskOp)Dataflow task operation
RT.dataflow_task allows specifying a task that will be executed concurrently when its operands are ready. Operands are either the results of computations in other RT.dataflow_task operations (dataflow dependences) or obtained from the execution context (immediate operands). Operands are synchronized using futures and, in the case of immediate operands, copied when the task is created. Caution is required when an operand is a pointer, as no deep copy will occur.
Example:
Traits: AutomaticAllocationScope, SingleBlockImplicitTerminator
Interfaces: AllocationOpInterface, MemoryEffectOpInterface, RegionBranchOpInterface
RT.dataflow_yield
(::mlir::concretelang::RT::DataflowYieldOp)Dataflow yield operation
RT.dataflow_yield is a special terminator operation for blocks inside the region of RT.dataflow_task. It allows specifying the return values of an RT.dataflow_task.
Example:
Traits: ReturnLike, Terminator
RT.deallocate_future_data
(::mlir::concretelang::RT::DeallocateFutureDataOp)
RT.deallocate_future
(::mlir::concretelang::RT::DeallocateFutureOp)
RT.deref_return_ptr_placeholder
(::mlir::concretelang::RT::DerefReturnPtrPlaceholderOp)
RT.deref_work_function_argument_ptr_placeholder
(::mlir::concretelang::RT::DerefWorkFunctionArgumentPtrPlaceholderOp)
RT.make_ready_future
(::mlir::concretelang::RT::MakeReadyFutureOp)Build a ready future.
Data passed to dataflow tasks must be encapsulated in futures, including immediate operands. These must be converted into futures using RT.make_ready_future.
Interfaces: AllocationOpInterface, MemoryEffectOpInterface
RT.register_task_work_function
(::mlir::concretelang::RT::RegisterTaskWorkFunctionOp)Register the task work-function with the runtime system.
RT.work_function_return
(::mlir::concretelang::RT::WorkFunctionReturnOp)
Future with a parameterized element type
The value of a !RT.future type represents the result of an asynchronous operation.
Examples:
Pointer to a parameterized element type
Low Level Fully Homomorphic Encryption dialect. A dialect for the representation of low-level operations on fully homomorphic ciphertexts.
Concrete.add_lwe_buffer
(::mlir::concretelang::Concrete::AddLweBufferOp)Returns the sum of 2 lwe ciphertexts
Concrete.add_lwe_tensor
(::mlir::concretelang::Concrete::AddLweTensorOp)Returns the sum of 2 lwe ciphertexts
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.add_plaintext_lwe_buffer
(::mlir::concretelang::Concrete::AddPlaintextLweBufferOp)Returns the sum of a clear integer and an lwe ciphertext
Concrete.add_plaintext_lwe_tensor
(::mlir::concretelang::Concrete::AddPlaintextLweTensorOp)Returns the sum of a clear integer and an lwe ciphertext
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_add_lwe_buffer
(::mlir::concretelang::Concrete::BatchedAddLweBufferOp)Batched version of AddLweBufferOp, which performs the same operation on multiple elements
Concrete.batched_add_lwe_tensor
(::mlir::concretelang::Concrete::BatchedAddLweTensorOp)Batched version of AddLweTensorOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_add_plaintext_cst_lwe_buffer
(::mlir::concretelang::Concrete::BatchedAddPlaintextCstLweBufferOp)Batched version of AddPlaintextLweBufferOp, which performs the same operation on multiple elements
Concrete.batched_add_plaintext_cst_lwe_tensor
(::mlir::concretelang::Concrete::BatchedAddPlaintextCstLweTensorOp)Batched version of AddPlaintextLweTensorOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_add_plaintext_lwe_buffer
(::mlir::concretelang::Concrete::BatchedAddPlaintextLweBufferOp)Batched version of AddPlaintextLweBufferOp, which performs the same operation on multiple elements
Concrete.batched_add_plaintext_lwe_tensor
(::mlir::concretelang::Concrete::BatchedAddPlaintextLweTensorOp)Batched version of AddPlaintextLweTensorOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_bootstrap_lwe_buffer
(::mlir::concretelang::Concrete::BatchedBootstrapLweBufferOp)Batched version of BootstrapLweOp, which performs the same operation on multiple elements
Concrete.batched_bootstrap_lwe_tensor
(::mlir::concretelang::Concrete::BatchedBootstrapLweTensorOp)Batched version of BootstrapLweOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_keyswitch_lwe_buffer
(::mlir::concretelang::Concrete::BatchedKeySwitchLweBufferOp)Batched version of KeySwitchLweOp, which performs the same operation on multiple elements
Concrete.batched_keyswitch_lwe_tensor
(::mlir::concretelang::Concrete::BatchedKeySwitchLweTensorOp)Batched version of KeySwitchLweOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_mapped_bootstrap_lwe_buffer
(::mlir::concretelang::Concrete::BatchedMappedBootstrapLweBufferOp)Batched, mapped version of BootstrapLweOp, which performs the same operation on multiple elements
Concrete.batched_mapped_bootstrap_lwe_tensor
(::mlir::concretelang::Concrete::BatchedMappedBootstrapLweTensorOp)Batched, mapped version of BootstrapLweOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_mul_cleartext_cst_lwe_buffer
(::mlir::concretelang::Concrete::BatchedMulCleartextCstLweBufferOp)Batched version of MulCleartextLweBufferOp, which performs the same operation on multiple elements
Concrete.batched_mul_cleartext_cst_lwe_tensor
(::mlir::concretelang::Concrete::BatchedMulCleartextCstLweTensorOp)Batched version of MulCleartextLweTensorOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_mul_cleartext_lwe_buffer
(::mlir::concretelang::Concrete::BatchedMulCleartextLweBufferOp)Batched version of MulCleartextLweBufferOp, which performs the same operation on multiple elements
Concrete.batched_mul_cleartext_lwe_tensor
(::mlir::concretelang::Concrete::BatchedMulCleartextLweTensorOp)Batched version of MulCleartextLweTensorOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.batched_negate_lwe_buffer
(::mlir::concretelang::Concrete::BatchedNegateLweBufferOp)Batched version of NegateLweBufferOp, which performs the same operation on multiple elements
Concrete.batched_negate_lwe_tensor
(::mlir::concretelang::Concrete::BatchedNegateLweTensorOp)Batched version of NegateLweTensorOp, which performs the same operation on multiple elements
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.bootstrap_lwe_buffer
(::mlir::concretelang::Concrete::BootstrapLweBufferOp)Bootstraps a LWE ciphertext with a GLWE trivial encryption of the lookup table
Concrete.bootstrap_lwe_tensor
(::mlir::concretelang::Concrete::BootstrapLweTensorOp)Bootstraps an LWE ciphertext with a GLWE trivial encryption of the lookup table
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.encode_expand_lut_for_bootstrap_buffer
(::mlir::concretelang::Concrete::EncodeExpandLutForBootstrapBufferOp)Encode and expand a lookup table so that it can be used for a bootstrap
Concrete.encode_expand_lut_for_bootstrap_tensor
(::mlir::concretelang::Concrete::EncodeExpandLutForBootstrapTensorOp)Encode and expand a lookup table so that it can be used for a bootstrap
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.encode_lut_for_crt_woppbs_buffer
(::mlir::concretelang::Concrete::EncodeLutForCrtWopPBSBufferOp)Encode and expand a lookup table so that it can be used for a crt wop pbs
Concrete.encode_lut_for_crt_woppbs_tensor
(::mlir::concretelang::Concrete::EncodeLutForCrtWopPBSTensorOp)Encode and expand a lookup table so that it can be used for a wop pbs
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.encode_plaintext_with_crt_buffer
(::mlir::concretelang::Concrete::EncodePlaintextWithCrtBufferOp)Encodes a plaintext by decomposing it on a crt basis
Concrete.encode_plaintext_with_crt_tensor
(::mlir::concretelang::Concrete::EncodePlaintextWithCrtTensorOp)Encodes a plaintext by decomposing it on a crt basis
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.keyswitch_lwe_buffer
(::mlir::concretelang::Concrete::KeySwitchLweBufferOp)Performs a keyswitching operation on an LWE ciphertext
Concrete.keyswitch_lwe_tensor
(::mlir::concretelang::Concrete::KeySwitchLweTensorOp)Performs a keyswitching operation on an LWE ciphertext
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.mul_cleartext_lwe_buffer
(::mlir::concretelang::Concrete::MulCleartextLweBufferOp)Returns the product of a clear integer and a lwe ciphertext
Concrete.mul_cleartext_lwe_tensor
(::mlir::concretelang::Concrete::MulCleartextLweTensorOp)Returns the product of a clear integer and a lwe ciphertext
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.negate_lwe_buffer
(::mlir::concretelang::Concrete::NegateLweBufferOp)Negates an lwe ciphertext
Concrete.negate_lwe_tensor
(::mlir::concretelang::Concrete::NegateLweTensorOp)Negates an lwe ciphertext
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
Concrete.wop_pbs_crt_lwe_buffer
(::mlir::concretelang::Concrete::WopPBSCRTLweBufferOp)
Concrete.wop_pbs_crt_lwe_tensor
(::mlir::concretelang::Concrete::WopPBSCRTLweTensorOp)
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{}
A runtime context
Syntax: !Concrete.context
An abstract runtime context to pass contextual values, like public keys, ...
Tracing dialect A dialect to print program values at runtime.
Tracing.trace_ciphertext
(::mlir::concretelang::Tracing::TraceCiphertextOp)Prints a ciphertext.
Tracing.trace_message
(::mlir::concretelang::Tracing::TraceMessageOp)Prints a message.
Tracing.trace_plaintext
(::mlir::concretelang::Tracing::TracePlaintextOp)Prints a plaintext.
Dialect for the construction of static data flow graphs. A dialect for the construction of static data flow graphs: the data flow graph is composed of a set of processes, connected through data streams. Special streams allow data to be injected into and retrieved from the data flow graph.
SDFG.get
(::mlir::concretelang::SDFG::Get)Retrieves a data element from a stream
Retrieves a single data element from the specified stream (i.e., an instance of the element type of the stream).
Example:
SDFG.init
(::mlir::concretelang::SDFG::Init)Initializes the streaming framework
Initializes the streaming framework. This operation must be performed before control reaches any other operation from the dialect.
Example:
SDFG.make_process
(::mlir::concretelang::SDFG::MakeProcess)Creates a new SDFG process
Creates a new SDFG process and connects it to the input and output streams.
Example:
SDFG.make_stream
(::mlir::concretelang::SDFG::MakeStream)Returns a new SDFG stream
Returns a new SDFG stream, transporting data either between processes on the device, from the host to the device, or from the device to the host. All streams are typed, allowing data to be read / written through SDFG.get and SDFG.put only using the stream's type.
Example:
SDFG.put
(::mlir::concretelang::SDFG::Put)Writes a data element to a stream
Writes the input operand to the specified stream. The operand's type must match the element type of the stream.
Example:
SDFG.shutdown
(::mlir::concretelang::SDFG::Shutdown)Shuts down the streaming framework
Shuts down the streaming framework. This operation must be performed after any other operation from the dialect.
Example:
SDFG.start
(::mlir::concretelang::SDFG::Start)Finalizes the creation of an SDFG and starts execution of its processes
Finalizes the creation of an SDFG and starts execution of its processes. Any creation of streams and processes must take place before control reaches this operation.
Example:
Process kind
Syntax:
Stream kind
Syntax:
An SDFG data flow graph
Syntax: !SDFG.dfg
A handle to an SDFG data flow graph
An SDFG data stream
An SDFG stream to connect SDFG processes.
a
b
integer
«unnamed»
a
b
«unnamed»
a
lut
tensor of integer values
«unnamed»
left
An encrypted boolean
right
An encrypted boolean
«unnamed»
An encrypted boolean
left
An encrypted boolean
right
An encrypted boolean
«unnamed»
An encrypted boolean
value
An encrypted boolean
«unnamed»
An encrypted boolean
left
An encrypted boolean
right
An encrypted boolean
«unnamed»
An encrypted boolean
left
An encrypted boolean
right
An encrypted boolean
[Operand, attribute, and result tables for the operations of the Concrete compiler's MLIR dialects (FHE, Concrete, and SDFG); refer to the dialect reference for the complete per-operation tables.]
This document gives an overview of the benchmarking infrastructure of Concrete.
Concrete Python uses progress-tracker-python to do benchmarks. Please refer to its README to learn how it works.
Use the makefile target:
Note that this command removes the previous benchmark results before running the new benchmark.
Since the full benchmark suite takes a long time to run, it's not recommended for development. Instead, use the following command to run just a single benchmark.
This command only runs the benchmarks defined in benchmarks/foo.py. It also retains the previous runs, so it can be run back to back to collect data from multiple benchmarks.
Simply add a new Python script to the benchmarks directory and write your logic.
The recommended file structure is as follows:
Feel free to check benchmarks/primitive.py
to see this structure in action.
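For illustration only, a new benchmarks/foo.py might be laid out roughly as follows. The progress-tracker-python calls are deliberately left as comments, since their exact API is described in its README and demonstrated in benchmarks/primitive.py; the function and inputset here are placeholders.

```python
# Hypothetical benchmarks/foo.py layout; the progress-tracker-python calls are
# intentionally only sketched in comments (see its README for the real API).
from concrete import fhe

# 1. Define the function to benchmark.
@fhe.compiler({"x": "encrypted"})
def foo(x):
    return x ** 2

def main():
    # 2. Compile with a representative inputset.
    circuit = foo.compile(range(2 ** 4))

    # 3. Generate keys, run the circuit, and record the measurements you care
    #    about (e.g., key generation time, evaluation time) using the
    #    progress-tracker-python helpers.
    circuit.keygen()
    circuit.encrypt_run_decrypt(3)

if __name__ == "__main__":
    main()
```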
After compilation, we end up with several artifacts, including the crypto parameters and a binary file containing the executable circuit. To encrypt inputs and run the circuit properly, we need to know how to interpret these artifacts, and there are a couple of utility functions for loading them. These utility functions are accessible from a variety of languages, including Python and C++.
We will use a really simple example for the demo, but the same steps can be applied to any other circuit. example.mlir
will contain the MLIR below:
You can use the concretecompiler binary to compile this MLIR program. The same can be done with concrete-python, as we only need the compilation artifacts at the end.
You should be able to see the artifacts listed in the python-demo directory.
Now we want to use the Python bindings in order to call the compiled circuit.
The main struct for managing compilation artifacts is LibrarySupport. You will have to create one with the path you used during compilation, then load the result of the compilation.
Using the compilation result, you can load the server lambda (the entrypoint to the executable compiled circuit) as well as the client parameters (containing the crypto parameters).
The client uses the client parameters to generate keys and encrypt arguments for the circuit.
Only evaluation keys are required for the execution of the circuit. You can execute the circuit on the encrypted arguments via server_lambda_call.
At this point you have the encrypted result and can decrypt it using the keyset, which holds the secret key.
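Putting these steps together, here is a heavily simplified sketch of the whole flow. Apart from LibrarySupport and server_lambda_call, which are named above, the helper names and signatures below are assumptions made for illustration; check the Python bindings (and test_compilation.py) for the exact API.

```python
# Illustrative sketch only: apart from LibrarySupport and server_lambda_call,
# the helper names and signatures are assumptions, not the exact bindings API.
from concrete.compiler import ClientSupport, LibrarySupport

support = LibrarySupport.new("./python-demo")            # path used during compilation
compilation_result = support.load_compilation_result()   # assumed loader name

# Server side: the entrypoint to the executable compiled circuit.
server_lambda = support.load_server_lambda(compilation_result)

# Client side: crypto parameters, key generation, and argument encryption.
client_parameters = support.load_client_parameters(compilation_result)
key_set = ClientSupport.key_set(client_parameters)                      # assumed helper
encrypted_args = ClientSupport.encrypt_arguments(client_parameters, key_set, [3, 4])

# Execution only needs the evaluation keys, never the secret key.
evaluation_keys = key_set.get_evaluation_keys()                         # assumed accessor
encrypted_result = support.server_lambda_call(server_lambda, encrypted_args, evaluation_keys)

# Decryption uses the key set, which holds the secret key.
result = ClientSupport.decrypt_result(client_parameters, key_set, encrypted_result)  # assumed
print(result)
```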
There are also a couple of tests in test_compilation.py that show how to compile and run a circuit between a client and a server using serialization.
This document gives an overview of the structure of the examples, which are tutorials containing more or less elaborate usages of Concrete to showcase its functionality on practical use cases. Examples are provided either as a Python script or as a Jupyter notebook.
Create examples/foo/foo.ipynb
Write the example in the notebook
The notebook will be executed in the CI with the make test-notebooks target
Create examples/foo/foo.py
Write the example in the script
The example should contain a class called Foo
Foo should have the following arguments in its __init__:
configuration: Optional[fhe.Configuration] = None
compiled: bool = True
It should compile the circuit with an appropriate inputset, using the given configuration, if compiled is true (see the skeleton sketched after this list)
It should have any additional common utilities (e.g., encoding/decoding) shared between the tests and the benchmarks
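For illustration, a minimal skeleton that satisfies these requirements could look like the following; the circuit body and inputset are placeholders for the actual example logic.

```python
from typing import Optional

from concrete import fhe


class Foo:
    def __init__(
        self,
        configuration: Optional[fhe.Configuration] = None,
        compiled: bool = True,
    ):
        # Placeholder circuit: replace with the actual logic of the example.
        @fhe.compiler({"x": "encrypted"})
        def foo(x):
            return x + 42

        if compiled:
            # Placeholder inputset: use values representative of the example.
            inputset = range(2 ** 4)
            self.circuit = foo.compile(inputset, configuration)

    # Common utilities (e.g., encoding/decoding) shared by the tests and the
    # benchmarks would go here.
```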
Then, add tests for the implementation in tests/execution/test_examples.py
Optionally, create benchmarks/foo.py
and add benchmarks.
This document explains how Zama people can release a new version of Concrete.
All releases should be done on a release branch: our release branches are named release/MAJOR.MINOR.x (e.g., release/2.7.x):
Either you create a new version, in which case you need to create a new release branch (e.g., the previous release was 2.6.x and you now release 2.7.0),
or you create a dot release, in which case you should cherry-pick commits onto the branch of the release you want to fix (e.g., the previous release was 2.7.0 and you now release 2.7.1).
The release/MAJOR.MINOR.x branch is the branch from which all vMAJOR.MINOR.* releases are made, and from which the gitbook documentation at https://docs.zama.ai/concrete/v/MAJOR.MINOR is built.
Each push to the release branch starts all Concrete tests. When you are happy with the state of the release branch, you need to update the API documentation:
If you miss this step, the release workflow will stop at the release-checks step of concrete_python_release.yml. Don't forget to push the updated API docs to the branch.
Then you just need to tag.
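For example, a plain git tag pushed to the repository would look like this (the version number is purely illustrative, and any dedicated tagging script used by the team would replace these commands):

```shell
# The tag follows the vMAJOR.MINOR.REVISION convention; 2.7.0 is illustrative.
git checkout release/2.7.x
git tag v2.7.0
git push origin v2.7.0
```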
This new tag push will start the release workflow: the workflow builds all release artifacts and then creates a new draft release on GitHub, which you can find at https://github.com/zama-ai/concrete/releases/tag/vMAJOR.MINOR.REVISION.
You should edit the changelog and the release documentation, then have them reviewed by the product marketing team.
When the new release documentation has been reviewed, you may save the release as a non-draft release, then publish the wheels on PyPI using the https://github.com/zama-ai/concrete/actions/workflows/push_wheels_to_public_pypi.yml workflow, setting the version number to MAJOR.MINOR.VERSION.
Follow the summary checklist:
At the end, check all the artifacts: