This document presents GPU performance benchmarks for programmable bootstrapping (PBS) and keyswitch operations in TFHE-rs.
All GPU benchmarks were run on H100 GPUs and rely on the multithreaded PBS algorithm with a grouping factor of 3.
TFHE-rs benchmarks can be easily reproduced from the source.
The following example shows how to reproduce TFHE-rs benchmarks: