This document details the GPU performance benchmarks of programmable bootstrapping and keyswitch operations using TFHE-rs.
All GPU benchmarks were launched on H100 GPUs, and rely on the multithreaded PBS algorithm with a grouping factor set to 3.
TFHE-rs benchmarks can be easily reproduced from the source.
The following example shows how to reproduce TFHE-rs benchmarks:
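As a rough sketch only: the commands below assume a local clone of the TFHE-rs repository, a machine with a CUDA-capable GPU and toolchain, and the repository's Makefile. The repository URL and the `bench_pbs_gpu` target name are not given in this document; they are recalled from the project and may differ between releases, so check the Makefile of the version being benchmarked. Keyswitch benchmarks are run through a separate target in the same Makefile.

```shell
# Assumption: repository URL and Makefile target name are not stated in this
# document; verify both against the TFHE-rs release you are using.
git clone https://github.com/zama-ai/tfhe-rs.git
cd tfhe-rs

# PBS benchmarks on GPU (assumed target name)
make bench_pbs_gpu
```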
This document details the GPU performance benchmarks of homomorphic operations using TFHE-rs.
By their nature, homomorphic operations run slower than their cleartext equivalents.
All GPU benchmarks were launched on H100 GPUs, and rely on the multithreaded PBS algorithm.
This document details the GPU performance benchmarks of homomorphic operations on integers using TFHE-rs.
The cryptographic parameters PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS were used.
The results below are for execution on a single H100. The following table shows the performance when both inputs of the benchmarked operation are encrypted:
The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size:
The results below are for execution on two H100s. The following table shows the performance when both inputs of the benchmarked operation are encrypted:
The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size:
TFHE-rs benchmarks can be easily reproduced from the source.
The following example shows how to reproduce TFHE-rs benchmarks:
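A minimal sketch for the integer benchmarks, assuming the commands are run from the root of a TFHE-rs checkout on a machine with a CUDA-capable GPU. The `BENCH_OP_FLAVOR` variable and the `bench_integer_multi_bit_gpu` target are recalled from the project's Makefile, not taken from this document, and may have been renamed in other releases; treat them as assumptions and confirm them in the Makefile.

```shell
# Assumption: variable and target names come from the TFHE-rs Makefile and
# may differ between releases; run from the repository root.

# Integer benchmarks on GPU with the default operation flavor (assumed names)
make BENCH_OP_FLAVOR=DEFAULT bench_integer_multi_bit_gpu
```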