Benchmarks

Homomorphic operations are inherently slower than their cleartext equivalents. This page reports timings for basic operations. For completeness, benchmarks for other libraries are also given.

All CPU benchmarks were launched on an AWS hpc7a.96xlarge instance with the following specifications: AMD EPYC 9R14 CPU @ 2.60GHz and 740GB of RAM.

Integer

This section measures the execution time of a set of operations from tfhe-rs::integer (the unsigned version). Timings for FheInt (i.e., the signed integers) are similar.

The table below reports the timing on CPU when the inputs of the benchmarked operation are encrypted.

The table below reports the timing on CPU when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size.

All timings are for parallelized Radix-based integer operations, where each block is encrypted using the default parameters (i.e., PARAM_MESSAGE_2_CARRY_2_KS_PBS; more information about parameters can be found here). To ensure predictable timings, the operation flavor is the default one: the carry is propagated if needed. Operation costs may be reduced by using the unchecked, checked, or smart flavors.
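To make the radix representation and the carry propagation step more concrete, here is a minimal cleartext sketch in plain Rust. It is not the tfhe-rs API and involves no encryption: the function names (to_radix, add_blocks_raw, propagate_carries, default_add_model) are hypothetical, and the only thing borrowed from the text above is the assumption of 2 message bits per block. The propagation is sequential here, whereas the benchmarked operations are parallelized.

```rust
/// Assumed message width per block, mirroring the "MESSAGE_2" part of the
/// default parameter name. This is a cleartext toy model, not tfhe-rs code.
const MESSAGE_BITS: u32 = 2;
const MESSAGE_MODULUS: u64 = 1 << MESSAGE_BITS; // 4

/// Split a cleartext value into `num_blocks` base-4 digits, least significant first.
fn to_radix(mut value: u64, num_blocks: usize) -> Vec<u64> {
    let mut blocks = Vec::with_capacity(num_blocks);
    for _ in 0..num_blocks {
        blocks.push(value % MESSAGE_MODULUS);
        value /= MESSAGE_MODULUS;
    }
    blocks
}

/// Recombine blocks into a single value (wrapping on u64 overflow).
fn from_radix(blocks: &[u64]) -> u64 {
    blocks
        .iter()
        .rev()
        .fold(0u64, |acc, &b| acc.wrapping_mul(MESSAGE_MODULUS).wrapping_add(b))
}

/// Block-wise addition with no carry handling: a block may now exceed 3,
/// i.e. part of the result sits in the block's carry space.
fn add_blocks_raw(a: &[u64], b: &[u64]) -> Vec<u64> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Sequential carry propagation: each block keeps its low 2 bits and hands
/// the rest to the next block. The carry out of the last block is dropped.
fn propagate_carries(blocks: &mut [u64]) {
    let mut carry = 0;
    for block in blocks.iter_mut() {
        let total = *block + carry;
        *block = total % MESSAGE_MODULUS;
        carry = total / MESSAGE_MODULUS;
    }
}

/// Model of an addition followed by carry propagation.
fn default_add_model(a: u64, b: u64, num_blocks: usize) -> u64 {
    let mut sum = add_blocks_raw(&to_radix(a, num_blocks), &to_radix(b, num_blocks));
    propagate_carries(&mut sum);
    from_radix(&sum)
}
```

With 4 blocks (8 bits), default_add_model(200, 100, 4) wraps around modulo 256, as an 8-bit unsigned addition would. In this model, skipping propagate_carries is what makes a raw block-wise addition cheaper, at the cost of leaving carries pending in the blocks.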

Benchmark results on GPU for all these operations can be consulted here.

Shortint

This measures the execution time for some operations using various parameter sets of tfhe-rs::shortint. Except for unchecked_add, all timings are related to the default operations. This flavor ensures predictable timings for an operation along the entire circuit by clearing the carry space after each operation.
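As a rough illustration of why clearing the carry space gives predictable behavior, the following cleartext sketch models a single block with 2 message bits and 2 carry bits. The functions toy_unchecked_add, toy_default_add, and max_unchecked_terms are hypothetical names for this toy model, not tfhe-rs functions, and reducing the value modulo the message modulus is only an assumed stand-in for the real carry-clearing step.

```rust
/// Toy cleartext model of one block. Assumed layout: 2 message bits and
/// 2 carry bits, so the raw value must stay in 0..16.
const MESSAGE_MODULUS: u8 = 4;
const CARRY_MODULUS: u8 = 4;

/// Raw addition: the result may spill into the carry space.
fn toy_unchecked_add(a: u8, b: u8) -> u8 {
    a + b
}

/// Addition followed by clearing the carry space, so the output is always
/// back in the message range 0..4 whatever the inputs were.
fn toy_default_add(a: u8, b: u8) -> u8 {
    toy_unchecked_add(a, b) % MESSAGE_MODULUS
}

/// True while the raw value still fits in message + carry space.
fn fits_in_block(raw: u8) -> bool {
    raw < MESSAGE_MODULUS * CARRY_MODULUS
}

/// Number of worst-case fresh values (each equal to 3) that can be summed
/// with raw additions before the block would overflow.
fn max_unchecked_terms() -> u8 {
    (MESSAGE_MODULUS * CARRY_MODULUS - 1) / (MESSAGE_MODULUS - 1)
}
```

In this model, at most five worst-case fresh values can be accumulated with raw additions before the carry space has to be cleared, whereas the carry-clearing variant can be chained indefinitely because every output returns to the message range.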

This uses the Concrete FFT + AVX-512 configuration.

Boolean

This measures the execution time of a single binary Boolean gate.

tfhe-rs::boolean.

tfhe-lib.

Using the same hpc7a.96xlarge machine as the one for tfhe-rs, the timings are:

OpenFHE (v1.1.2).

Following the official instructions from OpenFHE, clang14 and the following command are used to set up the project:

cmake -DNATIVE_SIZE=32 -DWITH_NATIVEOPT=ON -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DWITH_OPENMP=OFF ..

To use the HEXL library, the configuration used is as follows:

export CXX=clang++
export CC=clang

scripts/configure.sh
Release -> y
hexl -> y

scripts/build-openfhe-development-hexl.sh

Using the same hpc7a.96xlarge machine as the one for tfhe-rs, the timings are:

How to reproduce TFHE-rs benchmarks

TFHE-rs benchmarks can be reproduced from source using the make targets below.

AVX-512 is now enabled by default for benchmarks when available.

# Boolean benchmarks:
make bench_boolean

# Integer benchmarks:
make bench_integer

# Shortint benchmarks:
make bench_shortint
