GPU acceleration


Last updated 2 months ago

This guide explains how to update your existing program to leverage GPU acceleration, or to start a new program using GPU.

TFHE-rs now supports a GPU backend with a CUDA implementation, enabling integer arithmetic operations on encrypted data.

Prerequisites

  • CUDA version >= 10

  • Compute Capability >= 3.0

  • gcc >= 8.0 - check this page for more details about nvcc/gcc compatible versions

  • cmake >= 3.24

  • libclang, to match Rust bindgen requirements >= 9.0

  • Rust version - check this page

Importing to your project

To use the TFHE-rs GPU backend in your project, add the following dependency in your Cargo.toml.

tfhe = { version = "~1.0.1", features = ["boolean", "shortint", "integer", "gpu"] }

For optimal performance when using TFHE-rs, run your code in release mode with the --release flag.

Supported platforms

TFHE-rs GPU backend is supported on Linux (x86, aarch64).

OS        x86           aarch64
Linux     Supported     Supported*
macOS     Unsupported   Unsupported*
Windows   Unsupported   Unsupported

A first example

Configuring and creating keys.

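Compared to the CPU example, the GPU setup differs in the key creation, as detailed in the "Setting the keys" section.

Here is a full example (combining the client and server parts):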

use tfhe::{ConfigBuilder, set_server_key, FheUint8, ClientKey, CompressedServerKey};
use tfhe::prelude::*;

fn main() {

    let config = ConfigBuilder::default().build();

    let client_key = ClientKey::generate(config);
    let compressed_server_key = CompressedServerKey::new(&client_key);

    let gpu_key = compressed_server_key.decompress_to_gpu();

    let clear_a = 27u8;
    let clear_b = 128u8;

    let a = FheUint8::encrypt(clear_a, &client_key);
    let b = FheUint8::encrypt(clear_b, &client_key);

    //Server-side

    set_server_key(gpu_key);
    let result = a + b;

    //Client-side
    let decrypted_result: u8 = result.decrypt(&client_key);

    let clear_result = clear_a + clear_b;

    assert_eq!(decrypted_result, clear_result);
}

Note that when the gpu feature is enabled, calling let config = ConfigBuilder::default().build(); selects cryptographic parameters that differ from the ones used on the CPU (that is, when the gpu feature is not enabled). TFHE-rs uses dedicated parameters for the GPU in order to achieve better performance.

Setting the keys

Key configuration differs from the CPU. The client (which is assumed to run on a CPU) still generates both the client key and the server key, but the server must then decompress the server key to convert it into the right format for the GPU. To do so, the server calls decompress_to_gpu() on the compressed server key.

Once the key is decompressed, operations are written identically on CPU and GPU.

Encryption

On the client side, data is encrypted exactly as on the CPU, as shown in the following example:

    let clear_a = 27u8;
    let clear_b = 128u8;
    
    let a = FheUint8::encrypt(clear_a, &client_key);
    let b = FheUint8::encrypt(clear_b, &client_key);

Computation

The server first needs to set up its keys with set_server_key(gpu_key). Then, homomorphic computations are performed using the same approach as the CPU operations.

    //Server-side
    set_server_key(gpu_key);
    let result = a + b;

    //Client-side
    let decrypted_result: u8 = result.decrypt(&client_key);

    let clear_result = clear_a + clear_b;

    assert_eq!(decrypted_result, clear_result);

Decryption

Finally, the client decrypts the results using:

    let decrypted_result: u8 = result.decrypt(&client_key);

List of available operations

The GPU backend includes the following operations for both signed and unsigned encrypted integers:

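name                    symbol         Enc/Enc   Enc/Int
Neg                     -              ✔️        N/A
Add                     +              ✔️        ✔️
Sub                     -              ✔️        ✔️
Mul                     *              ✔️        ✔️
Div                     /              ✔️        ✔️
Rem                     %              ✔️        ✔️
Not                     !              ✔️        N/A
BitAnd                  &              ✔️        ✔️
BitOr                   |              ✔️        ✔️
BitXor                  ^              ✔️        ✔️
Shr                     >>             ✔️        ✔️
Shl                     <<             ✔️        ✔️
Rotate right            rotate_right   ✔️        ✔️
Rotate left             rotate_left    ✔️        ✔️
Min                     min            ✔️        ✔️
Max                     max            ✔️        ✔️
Greater than            gt             ✔️        ✔️
Greater or equal than   ge             ✔️        ✔️
Lower than              lt             ✔️        ✔️
Lower or equal than     le             ✔️        ✔️
Equal                   eq             ✔️        ✔️
Cast (into dest type)   cast_into      ✖️        N/A
Cast (from src type)    cast_from      ✖️        N/A
Ternary operator        select         ✔️        ✖️

All operations follow the same syntax as the one described in the Operations section.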
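When checking decrypted GPU results, it can help to have a clear-text reference to compare against. The following is a minimal sketch using only the Rust standard library; `reference_ops` is a hypothetical helper, not part of TFHE-rs, and it assumes that 8-bit encrypted arithmetic wraps modulo 2^8 in the same way as Rust's `wrapping_*` methods.

```rust
/// Clear-text reference values for a few operations from the table, on u8 inputs.
/// Assumption: FheUint8 arithmetic wraps modulo 2^8, like Rust's wrapping_* methods.
fn reference_ops(a: u8, b: u8) -> Vec<(&'static str, u8)> {
    vec![
        ("add", a.wrapping_add(b)),
        ("sub", a.wrapping_sub(b)),
        ("mul", a.wrapping_mul(b)),
        ("bitand", a & b),
        ("bitor", a | b),
        ("bitxor", a ^ b),
        ("rotate_left(1)", a.rotate_left(1)),
        ("min", a.min(b)),
        ("max", a.max(b)),
        // Comparisons yield a Boolean; it is stored as 0 or 1 here.
        ("gt", (a > b) as u8),
        // Ternary operator: pick `a` when the condition holds, `b` otherwise.
        ("select(a > b, a, b)", if a > b { a } else { b }),
    ]
}

fn main() {
    // Same clear inputs as in the first example of this guide.
    for (name, value) in reference_ops(27, 128) {
        println!("{name}: {value}");
    }
}
```

Each value can then be compared with the decryption of the corresponding homomorphic result.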

Multi-GPU support

TFHE-rs supports platforms with multiple GPUs. There is nothing to change in the code to execute on such platforms. To keep the API as user-friendly as possible, the configuration is automatically set, i.e., the user has no fine-grained control over the number of GPUs to be used.

Benchmark

Please refer to the GPU benchmarks for detailed performance benchmark results.

Warning

When measuring GPU times on your own on Linux, set the environment variable CUDA_MODULE_LOADING=EAGER to avoid CUDA API overheads during the first kernel execution.
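A minimal sketch of a timing harness for such measurements, using only the Rust standard library. The `eager_loading_enabled` and `time_op` helpers are hypothetical names, not part of TFHE-rs, and the closure in `main` is a stand-in workload to be replaced by the homomorphic operation you want to measure. The variable is assumed to be set in the shell before launching the program, for example `CUDA_MODULE_LOADING=EAGER cargo run --release`.

```rust
use std::env;
use std::time::{Duration, Instant};

/// Returns true when CUDA_MODULE_LOADING is set to EAGER in the environment.
fn eager_loading_enabled() -> bool {
    matches!(env::var("CUDA_MODULE_LOADING").as_deref(), Ok("EAGER"))
}

/// Hypothetical helper: runs `op` once as an untimed warm-up, then times `runs`
/// further calls and returns the last result together with the mean duration.
fn time_op<T>(runs: u32, mut op: impl FnMut() -> T) -> (T, Duration) {
    assert!(runs > 0, "need at least one timed run");
    let mut last = op(); // warm-up run, excluded from the measurement
    let start = Instant::now();
    for _ in 0..runs {
        last = op();
    }
    (last, start.elapsed() / runs)
}

fn main() {
    if !eager_loading_enabled() {
        eprintln!("warning: CUDA_MODULE_LOADING is not EAGER; the first kernel execution may include loading overhead");
    }
    // Stand-in workload; a real benchmark would wrap an operation such as `&a + &b` here.
    let (value, mean) = time_op(10, || (0u64..1_000).sum::<u64>());
    println!("result = {value}, mean time = {mean:?}");
}
```

The warm-up run is a common benchmarking precaution and complements the environment variable rather than replacing it.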

Compressing ciphertexts after some homomorphic computation on the GPU

You can compress ciphertexts using the GPU, even after computations, just like on the CPU.

The procedure is very similar to the one used on the CPU. The following example shows how to compress and decompress a list containing 4 messages:

  • One 32-bit integer

  • One 64-bit integer

  • One Boolean

  • One 2-bit integer

use tfhe::prelude::*;
use tfhe::shortint::parameters::{
    COMP_PARAM_MESSAGE_2_CARRY_2, PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS,
};
use tfhe::{
    set_server_key, CompressedCiphertextList, CompressedCiphertextListBuilder, FheBool,
    FheInt64, FheUint2, FheUint32,
};

fn main() {
    let config =
        tfhe::ConfigBuilder::with_custom_parameters(PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS)
            .enable_compression(COMP_PARAM_MESSAGE_2_CARRY_2)
            .build();

    let ck = tfhe::ClientKey::generate(config);
    let compressed_server_key = tfhe::CompressedServerKey::new(&ck);
    let gpu_key = compressed_server_key.decompress_to_gpu();

    set_server_key(gpu_key);

    let ct1 = FheUint32::encrypt(17_u32, &ck);

    let ct2 = FheInt64::encrypt(-1i64, &ck);

    let ct3 = FheBool::encrypt(false, &ck);

    let ct4 = FheUint2::encrypt(3u8, &ck);

    let compressed_list = CompressedCiphertextListBuilder::new()
        .push(ct1)
        .push(ct2)
        .push(ct3)
        .push(ct4)
        .build()
        .unwrap();

    let serialized = bincode::serialize(&compressed_list).unwrap();

    println!("Serialized size: {} bytes", serialized.len());

    let compressed_list: CompressedCiphertextList = bincode::deserialize(&serialized).unwrap();

    let a: FheUint32 = compressed_list.get(0).unwrap().unwrap();
    let b: FheInt64 = compressed_list.get(1).unwrap().unwrap();
    let c: FheBool = compressed_list.get(2).unwrap().unwrap();
    let d: FheUint2 = compressed_list.get(3).unwrap().unwrap();

    let a: u32 = a.decrypt(&ck);
    assert_eq!(a, 17);
    let b: i64 = b.decrypt(&ck);
    assert_eq!(b, -1);
    let c = c.decrypt(&ck);
    assert!(!c);
    let d: u8 = d.decrypt(&ck);
    assert_eq!(d, 3);

}

Array types

It is possible to use array types on GPU, just as on CPU. Here is an example showing how to do it:

use tfhe::{ConfigBuilder, set_server_key, ClearArray, ClientKey, CompressedServerKey};
use tfhe::array::GpuFheUint32Array;
use tfhe::prelude::*;

fn main() {
    let config = ConfigBuilder::default().build();

    let cks = ClientKey::generate(config);
    let compressed_server_key = CompressedServerKey::new(&cks);

    let gpu_key = compressed_server_key.decompress_to_gpu();
    set_server_key(gpu_key);

    let num_elems = 4 * 4;
    let clear_xs = (0..num_elems as u32).collect::<Vec<_>>();
    let clear_ys = vec![1u32; num_elems];

    // Encrypted 2D array with values
    // [[  0,  1,  2,  3]
    //  [  4,  5,  6,  7]
    //  [  8,  9, 10, 11]
    //  [ 12, 13, 14, 15]]
    let xs = GpuFheUint32Array::try_encrypt((clear_xs.as_slice(), vec![4, 4]), &cks).unwrap();
    // Encrypted 2D array with values
    // [[  1,  1,  1,  1]
    //  [  1,  1,  1,  1]
    //  [  1,  1,  1,  1]
    //  [  1,  1,  1,  1]]
    let ys = GpuFheUint32Array::try_encrypt((clear_ys.as_slice(), vec![4, 4]), &cks).unwrap();

    assert_eq!(xs.num_dim(), 2);
    assert_eq!(xs.shape(), &[4, 4]);
    assert_eq!(ys.num_dim(), 2);
    assert_eq!(ys.shape(), &[4, 4]);

    // Take a sub slice
    //  [[ 10, 11]
    //   [ 14, 15]]
    let xss = xs.slice(&[2..4, 2..4]);
    // Take a sub slice
    //  [[  1,  1]
    //   [  1,  1]]
    let yss = ys.slice(&[2..4, 2..4]);

    assert_eq!(xss.num_dim(), 2);
    assert_eq!(xss.shape(), &[2, 2]);
    assert_eq!(yss.num_dim(), 2);
    assert_eq!(yss.shape(), &[2, 2]);

    let r = &xss + &yss;

    // Result is
    //  [[ 11, 12]
    //   [ 15, 16]]
    let result: Vec<u32> = r.decrypt(&cks);
    assert_eq!(result, vec![11, 12, 15, 16]);

    // Clear 2D array with values
    //  [[  10,  20]
    //   [  30,  40]]
    let clear_array = ClearArray::new(vec![10u32, 20u32, 30u32, 40u32], vec![2, 2]);
    let r = &xss + &clear_array;

    // Result is
    //  [[ 20, 31]
    //   [ 44, 55]]
    let r: Vec<u32> = r.decrypt(&cks);
    assert_eq!(r, vec![20, 31, 44, 55]);
}
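The expected values in the array example can be reproduced on clear data. The sketch below uses only the Rust standard library; `slice_2d` and `add_elementwise` are hypothetical helpers that mimic, on row-major clear vectors, the slicing and element-wise addition performed on the encrypted arrays. They are not part of the TFHE-rs array API.

```rust
use std::ops::Range;

/// Extracts the given row and column ranges from a row-major matrix with `width` columns.
fn slice_2d(data: &[u32], width: usize, rows: Range<usize>, cols: Range<usize>) -> Vec<u32> {
    let mut out = Vec::with_capacity(rows.len() * cols.len());
    for r in rows {
        for c in cols.clone() {
            out.push(data[r * width + c]);
        }
    }
    out
}

/// Adds two equally sized arrays element by element (wrapping on overflow).
fn add_elementwise(a: &[u32], b: &[u32]) -> Vec<u32> {
    assert_eq!(a.len(), b.len(), "arrays must have the same number of elements");
    a.iter().zip(b).map(|(x, y)| x.wrapping_add(*y)).collect()
}

fn main() {
    let xs: Vec<u32> = (0..16).collect(); // 4x4 matrix holding 0..=15
    let ys = vec![1u32; 16]; // 4x4 matrix of ones

    // Bottom-right 2x2 blocks, matching slice(&[2..4, 2..4]) in the example.
    let xss = slice_2d(&xs, 4, 2..4, 2..4);
    let yss = slice_2d(&ys, 4, 2..4, 2..4);
    assert_eq!(xss, vec![10, 11, 14, 15]);

    // Encrypted + encrypted case of the example.
    assert_eq!(add_elementwise(&xss, &yss), vec![11, 12, 15, 16]);

    // Encrypted + clear case of the example.
    let clear = [10u32, 20, 30, 40];
    assert_eq!(add_elementwise(&xss, &clear), vec![20, 31, 44, 55]);
}
```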
