A simple example
The example shown in this section computes the sum of two integers using the GPU. It contains code that can be split into a client-side and a server-side part, but for simplicity it is shown as a single snippet. Only the server-side benefits from GPU acceleration.
This example shows how to use a single GPU to improve operation latency. It has the following structure:
Client-side: Generate client keys and GPU server keys. Encrypt two numbers
Server-side: Move server keys to GPU and perform the addition
Client-side: Decrypt the result
This example only performs an addition, but most FHE operations are supported on GPU. For a list see:
Operations
API elements discussed in this document
tfhe::ConfigBuilder::default(): Instantiates the default cryptographic parameters. When the "gpu" feature is activated, the default parameters are GPU-specific, which achieves optimal performance on GPU.
tfhe::CompressedServerKey::decompress_to_gpu: Decompresses a compressed ServerKey and copies it to all available GPUs.
tfhe::set_server_key: Sets the current server key. When this is a GPU key, this function activates execution of integer operations on all GPUs assigned to this key.
A simple TFHE-rs program
use tfhe::{ConfigBuilder, set_server_key, FheUint8, ClientKey, CompressedServerKey};
use tfhe::prelude::*;
fn main() {
    // Client-side: generate the keys and encrypt the inputs
    let config = ConfigBuilder::default().build();
    let client_key = ClientKey::generate(config);
    let compressed_server_key = CompressedServerKey::new(&client_key);

    // Decompress the server key and copy it to the GPU
    let gpu_key = compressed_server_key.decompress_to_gpu();

    let clear_a = 27u8;
    let clear_b = 128u8;
    let a = FheUint8::encrypt(clear_a, &client_key);
    let b = FheUint8::encrypt(clear_b, &client_key);

    // Server-side
    set_server_key(gpu_key);
    let result = a + b;

    // Client-side
    let decrypted_result: u8 = result.decrypt(&client_key);
    let clear_result = clear_a + clear_b;
    assert_eq!(decrypted_result, clear_result);
}
When the "gpu" feature is activated, calling:
let config = ConfigBuilder::default().build();
instantiates cryptographic parameters that are different from the CPU ones.
Breakdown of the GPU TFHE-rs program
Key generation
Compared to the CPU example, the server side in the code snippet above must call decompress_to_gpu to enable GPU execution for the ensuing operations on ciphertexts. This function assigns all available GPUs to the server key.
let gpu_key = compressed_server_key.decompress_to_gpu();
Once the key is decompressed to GPU and set with set_server_key, operations on ciphertexts execute on the GPU. In the example above:
compressed_server_key is a CompressedServerKey, stored on CPU. The client-side should ensure this key is generated with GPU cryptographic parameters.
gpu_key is the CudaServerKey corresponding to compressed_server_key and is stored on the GPU assigned to it.
set_server_key sets either a CPU or GPU key. In this example, compressed_server_key and gpu_key have GPU cryptographic parameters. A GPU server key can enable automatic parallelization on multiple GPUs.
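The way set_server_key selects where later operations run can be pictured with a toy model. The sketch below uses only the Rust standard library and does not call TFHE-rs. Device, set_key and add are hypothetical names invented for illustration, and storing the current key in a thread-local slot is an assumption of this sketch, not a description of the library's internals.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a server key. It only records which
// device the key targets and is NOT a TFHE-rs type.
#[derive(Clone, Debug, PartialEq)]
enum Device {
    Cpu,
    Gpu,
}

thread_local! {
    // Assumption of this sketch: the "current key" is a per-thread slot.
    static CURRENT_KEY: RefCell<Option<Device>> = RefCell::new(None);
}

// Toy counterpart of set_server_key: remember which key is active.
fn set_key(device: Device) {
    CURRENT_KEY.with(|slot| *slot.borrow_mut() = Some(device));
}

// Toy operation: uses whichever key is currently set and reports
// the device it would run on. Plain integers stand in for ciphertexts.
fn add(a: u8, b: u8) -> Result<(u8, Device), &'static str> {
    CURRENT_KEY.with(|slot| match slot.borrow().as_ref() {
        Some(device) => Ok((a.wrapping_add(b), device.clone())),
        None => Err("no server key set"),
    })
}

fn main() {
    // Before any key is set, the toy operation has nothing to run with.
    assert!(add(27, 128).is_err());

    // Setting a GPU key routes the following operations to the GPU.
    set_key(Device::Gpu);
    let (sum, device) = add(27, 128).unwrap();
    println!("{sum} computed with a {device:?} key");

    // Setting a CPU key afterwards switches the device for later calls.
    set_key(Device::Cpu);
    assert_eq!(add(1, 2), Ok((3, Device::Cpu)));
}
```

The point of the model is that the operation itself (a + b in the example above) stays identical; only the key that was set beforehand determines where it executes.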
Encryption
On the client-side, the method to encrypt the data is exactly the same as the CPU one, as shown in the following example:
let clear_a = 27u8;
let clear_b = 128u8;
let a = FheUint8::encrypt(clear_a, &client_key);
let b = FheUint8::encrypt(clear_b, &client_key);
Server-side computation
The server first needs to set up its keys with set_server_key(gpu_key). Then, homomorphic computations are performed using the same approach as the CPU operations.
//Server-side
set_server_key(gpu_key);
let result = a + b;
//Client-side
let decrypted_result: u8 = result.decrypt(&client_key);
let clear_result = clear_a + clear_b;
assert_eq!(decrypted_result, clear_result);
Decryption
Finally, the client decrypts the results using:
let decrypted_result: u8 = result.decrypt(&client_key);
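When checking a decrypted result against a clear computation, the reference value should follow the same overflow rule as the encrypted type. The sketch below assumes that FheUint8 addition wraps modulo 2^8, whereas the plain + operator on u8 panics on overflow in debug builds. With 27 and 128 the sum fits in a u8, so both forms agree. expected_sum is a hypothetical helper name, and the code uses only the standard library.

```rust
/// Clear-text reference for an 8-bit addition.
/// Assumption: the homomorphic addition wraps modulo 2^8,
/// so the reference wraps as well instead of panicking.
fn expected_sum(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

fn main() {
    // Values from the example: 27 + 128 = 155 fits in a u8.
    assert_eq!(expected_sum(27, 128), 155);

    // A sum above 255 wraps around: 200 + 100 = 300, and 300 - 256 = 44.
    assert_eq!(expected_sum(200, 100), 44);

    println!("reference checks passed");
}
```

Using a wrapping reference keeps the final assert_eq! meaningful for inputs whose sum exceeds 255, instead of failing in the clear computation before the comparison is reached.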
Optimizing for throughput
In order to improve operation throughput, you can use multiple GPUs with fine-grained GPU scheduling, as detailed on the following page:
Multi-GPU support