GPU acceleration
This guide explains how to update an existing program to use GPU acceleration, or how to start a new program on the GPU.
TFHE-rs now provides a GPU backend implemented in CUDA, enabling integer arithmetic operations on encrypted data.
Prerequisites
CUDA version >= 10
Compute Capability >= 3.0
cmake >= 3.24
libclang >= 9.0, to match Rust bindgen requirements
Rust version - check this page
Importing to your project
To use the TFHE-rs GPU backend in your project, add the following dependency to your Cargo.toml:
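A minimal sketch of such a dependency entry, assuming the backend is enabled through a `gpu` Cargo feature alongside `integer`. The wildcard version is a placeholder rather than a recommendation; pin it to the release you target and check the feature names against that release.

```toml
[dependencies]
# Placeholder version: replace "*" with the TFHE-rs release you target.
# The "integer" and "gpu" feature names are assumptions to verify for that release.
tfhe = { version = "*", features = ["integer", "gpu"] }
```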
For optimal performance when using TFHE-rs, run your code in release mode with the --release flag.
Supported platforms
TFHE-rs GPU backend is supported on Linux (x86, aarch64).
| OS      | x86         | aarch64      |
|---------|-------------|--------------|
| Linux   | Supported   | Supported*   |
| macOS   | Unsupported | Unsupported* |
| Windows | Unsupported | Unsupported  |
A first example
Configuring and creating keys.
Compared to the CPU example, the GPU setup differs in the key creation, as detailed here.
Here is a full example (combining the client and server parts):
Beware that when the GPU feature is activated, calling let config = ConfigBuilder::default().build(); selects cryptographic parameters that differ from the CPU ones used when the GPU feature is not activated. TFHE-rs uses dedicated parameters for the GPU in order to achieve better performance.
Setting the keys
The key configuration differs from the CPU one. Both client and server keys are still generated by the client (which is assumed to run on a CPU), but the server must then decompress the server key to convert it into the right format. To do so, the server should call decompress_to_gpu().
Once the key is decompressed, operations are written identically on CPU and GPU.
Encryption
On the client side, data is encrypted exactly as on the CPU, as shown in the following example:
Computation
The server first needs to set up its keys with set_server_key(gpu_key).
Then, homomorphic computations are performed using the same approach as the CPU operations.
Decryption
Finally, the client decrypts the results using:
List of available operations
The GPU backend includes the following operations for both signed and unsigned encrypted integers:
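| name                  | symbol       | Enc/Enc | Enc/Int |
|-----------------------|--------------|---------|---------|
| Neg                   | -            | ✔️      | N/A     |
| Add                   | +            | ✔️      | ✔️      |
| Sub                   | -            | ✔️      | ✔️      |
| Mul                   | *            | ✔️      | ✔️      |
| Div                   | /            | ✔️      | ✔️      |
| Rem                   | %            | ✔️      | ✔️      |
| Not                   | !            | ✔️      | N/A     |
| BitAnd                | &            | ✔️      | ✔️      |
| BitOr                 | \|           | ✔️      | ✔️      |
| BitXor                | ^            | ✔️      | ✔️      |
| Shr                   | >>           | ✔️      | ✔️      |
| Shl                   | <<           | ✔️      | ✔️      |
| Rotate right          | rotate_right | ✔️      | ✔️      |
| Rotate left           | rotate_left  | ✔️      | ✔️      |
| Min                   | min          | ✔️      | ✔️      |
| Max                   | max          | ✔️      | ✔️      |
| Greater than          | gt           | ✔️      | ✔️      |
| Greater or equal than | ge           | ✔️      | ✔️      |
| Lower than            | lt           | ✔️      | ✔️      |
| Lower or equal than   | le           | ✔️      | ✔️      |
| Equal                 | eq           | ✔️      | ✔️      |
| Cast (into dest type) | cast_into    | ✖️      | N/A     |
| Cast (from src type)  | cast_from    | ✖️      | N/A     |
| Ternary operator      | select       | ✔️      | ✖️      |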
All operations follow the same syntax as the one described here.
Multi-GPU support
TFHE-rs supports platforms with multiple GPUs. There is nothing to change in the code to execute on such platforms. To keep the API as user-friendly as possible, the configuration is automatically set, i.e., the user has no fine-grained control over the number of GPUs to be used.
Benchmark
Please refer to the GPU benchmarks for detailed performance benchmark results.
Warning
When measuring GPU times on your own on Linux, set the environment variable CUDA_MODULE_LOADING=EAGER to avoid CUDA API overheads during the first kernel execution.
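A small timing harness along these lines can help when taking your own measurements. It is a sketch using only the Rust standard library: both function names are hypothetical, and the untimed warm-up call is an extra precaution assumed here, not something the TFHE-rs documentation prescribes.

```rust
use std::env;
use std::time::{Duration, Instant};

/// Returns true when CUDA_MODULE_LOADING is set to EAGER in the environment.
fn eager_module_loading_enabled() -> bool {
    env::var("CUDA_MODULE_LOADING")
        .map(|value| value == "EAGER")
        .unwrap_or(false)
}

/// Runs `op` once without timing it, then returns the mean duration
/// of `runs` timed executions.
fn mean_time_after_warmup<F: FnMut()>(mut op: F, runs: u32) -> Duration {
    assert!(runs > 0, "runs must be positive");
    // Warm-up call: keeps one-time initialization out of the measurement.
    op();
    let start = Instant::now();
    for _ in 0..runs {
        op();
    }
    start.elapsed() / runs
}
```

The variable has to be set before the process starts, for example by prefixing the command: `CUDA_MODULE_LOADING=EAGER cargo run --release`. The closure passed to `mean_time_after_warmup` would wrap the homomorphic operation being measured.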
Compressing ciphertexts after some homomorphic computation on the GPU
You can compress ciphertexts using the GPU, even after computations, just like on the CPU.
The way to do it is very similar to how it's done on the CPU. The following example shows how to compress and decompress a list containing 4 messages:
One 32-bit integer
One 64-bit integer
One Boolean
One 2-bit integer
Array types
It is possible to use array types on GPU, just as on CPU. Here is an example showing how to do it: