Benchmarks
Homomorphic operations are inherently slower than their cleartext equivalents. Timings for some basic operations are given below. For completeness, benchmarks for other libraries are also given.
All benchmarks were launched on an AWS m6i.metal with the following specifications: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz and 512GB of RAM.
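As a rough illustration of how a mean execution time like the ones reported here could be measured, the following sketch times a closure with the Rust standard library only. The helper name `mean_duration` and its warm-up and iteration parameters are assumptions made for this example; it is not the harness that produced the published numbers.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `op` `warmup` times without timing it,
/// then `iters` times under a wall-clock timer, and returns the mean
/// duration of one timed call.
fn mean_duration<F: FnMut()>(warmup: u32, iters: u32, mut op: F) -> Duration {
    assert!(iters > 0, "need at least one timed iteration");
    // Warm-up calls let caches and lazy initialization settle.
    for _ in 0..warmup {
        op();
    }
    let start = Instant::now();
    for _ in 0..iters {
        op();
    }
    // `Duration` supports division by a `u32`.
    start.elapsed() / iters
}
```

The closure would wrap the operation under test, for example a single homomorphic addition on already-encrypted inputs, so that key generation and encryption stay outside the timed region.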
Integer
This measures the execution time for some operation sets of tfhe-rs::integer (the unsigned version). Note that the timings for FheInt (i.e., the signed integers) are similar.
Operation \ Size | | | | | | |
---|---|---|---|---|---|---|
Negation | 70.9 ms | 99.3 ms | 129 ms | 180 ms | 239 ms | 333 ms |
Add / Sub | 70.5 ms | 100 ms | 132 ms | 186 ms | 249 ms | 334 ms |
Mul | 144 ms | 216 ms | 333 ms | 832 ms | 2.50 s | 8.85 s |
Equal / Not Equal | 36.1 ms | 36.5 ms | 57.4 ms | 64.2 ms | 67.3 ms | 78.1 ms |
Comparisons | 52.6 ms | 73.1 ms | 98.8 ms | 124 ms | 165 ms | 201 ms |
Max / Min | 76.2 ms | 102 ms | 135 ms | 171 ms | 212 ms | 301 ms |
Bitwise operations | 19.4 ms | 20.3 ms | 21.0 ms | 27.2 ms | 31.6 ms | 40.2 ms |
Div / Rem | 729 ms | 1.93 s | 4.81 s | 12.2 s | 30.7 s | 89.6 s |
Left / Right Shifts | 99.4 ms | 129 ms | 180 ms | 243 ms | 372 ms | 762 ms |
Left / Right Rotations | 103 ms | 128 ms | 182 ms | 241 ms | 374 ms | 763 ms |
All timings are for parallelized radix-based integer operations, where each block is encrypted using the default parameters (i.e., PARAM_MESSAGE_2_CARRY_2_KS_PBS; more information about parameters can be found here). To ensure predictable timings, the default operation flavor is used: the carry is propagated if needed. Operation costs may be reduced by using the unchecked, checked, or smart flavors.
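To make the radix representation and carry propagation described above more concrete, here is a cleartext-only sketch. It models an 8-bit value as four blocks with 2 message bits and 2 carry bits each, mirroring the 2/2 split named by PARAM_MESSAGE_2_CARRY_2. All function names are hypothetical, nothing is encrypted, and the carry pass is a simple sequential one rather than the parallelized algorithm the library uses.

```rust
// Cleartext model: little-endian blocks, each with 2 message bits and
// 2 carry bits, so a block can hold any value in 0..16.
const MESSAGE_BITS: u32 = 2;
const CARRY_BITS: u32 = 2;
const MESSAGE_MODULUS: u8 = 1 << MESSAGE_BITS; // 4
const BLOCK_MODULUS: u8 = 1 << (MESSAGE_BITS + CARRY_BITS); // 16

/// Splits an 8-bit value into four 2-bit message blocks.
fn decompose(value: u8) -> [u8; 4] {
    let mut blocks = [0u8; 4];
    for (i, b) in blocks.iter_mut().enumerate() {
        *b = (value >> (i as u32 * MESSAGE_BITS)) & (MESSAGE_MODULUS - 1);
    }
    blocks
}

/// Rebuilds the 8-bit value from the message bits of each block.
fn recompose(blocks: &[u8; 4]) -> u8 {
    blocks.iter().enumerate().fold(0u8, |acc, (i, &b)| {
        acc | ((b & (MESSAGE_MODULUS - 1)) << (i as u32 * MESSAGE_BITS))
    })
}

/// Block-wise addition that leaves carries sitting in the carry bits.
fn unchecked_add(a: &[u8; 4], b: &[u8; 4]) -> [u8; 4] {
    let mut out = [0u8; 4];
    for i in 0..4 {
        out[i] = a[i] + b[i];
        // The caller must not overflow the carry space.
        debug_assert!(out[i] < BLOCK_MODULUS);
    }
    out
}

/// Moves each block's carry into the next block; the last carry is
/// dropped, which gives wrapping arithmetic modulo 2^8.
fn propagate_carries(blocks: &mut [u8; 4]) {
    let mut carry = 0u8;
    for b in blocks.iter_mut() {
        let total = *b + carry;
        *b = total % MESSAGE_MODULUS;
        carry = total / MESSAGE_MODULUS;
    }
}

/// Models the default flavor: add, then propagate carries.
fn default_add(a: &[u8; 4], b: &[u8; 4]) -> [u8; 4] {
    let mut out = unchecked_add(a, b);
    propagate_carries(&mut out);
    out
}
```

In this model the unchecked addition is a handful of block additions, while the default flavor adds the propagation pass. In the encrypted setting that pass is where the expensive work happens, which is the intuition behind the cost difference between flavors.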
Shortint
This measures the execution time for some operations using various parameter sets of tfhe-rs::shortint. Except for unchecked_add, all timings are for the default operations. This flavor ensures predictable timings for an operation along the entire circuit by clearing the carry space after each operation.
This uses the Concrete FFT + AVX-512 configuration.
Parameter set | PARAM_MESSAGE_1_CARRY_1 | PARAM_MESSAGE_2_CARRY_2 | PARAM_MESSAGE_3_CARRY_3 | PARAM_MESSAGE_4_CARRY_4 |
---|---|---|---|---|
unchecked_add | 348 ns | 413 ns | 2.95 µs | 12.1 µs |
add | 7.59 ms | 17.0 ms | 121 ms | 835 ms |
mul_lsb | 8.13 ms | 16.8 ms | 121 ms | 827 ms |
keyswitch_programmable_bootstrap | 7.28 ms | 16.6 ms | 121 ms | 811 ms |
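The table above mixes nanoseconds, microseconds, and milliseconds, which makes the gap between rows easy to misread. The sketch below converts a table cell to seconds and computes the ratio between two cells. The helper names `parse_timing` and `ratio` are hypothetical and exist only for this illustration.

```rust
/// Hypothetical helper: parses a cell such as "413 ns", "2.95 µs",
/// "17.0 ms" or "8.85 s" into seconds. Returns `None` for any other shape.
fn parse_timing(cell: &str) -> Option<f64> {
    let mut parts = cell.split_whitespace();
    let value: f64 = parts.next()?.parse().ok()?;
    let scale = match parts.next()? {
        "ns" => 1e-9,
        // Accept the micro sign, the Greek letter mu, and plain "us".
        "µs" | "μs" | "us" => 1e-6,
        "ms" => 1e-3,
        "s" => 1.0,
        _ => return None,
    };
    // Reject trailing tokens such as "17.0 ms approx".
    if parts.next().is_some() {
        return None;
    }
    Some(value * scale)
}

/// Hypothetical helper: how many times slower `slow` is than `fast`.
fn ratio(slow: &str, fast: &str) -> Option<f64> {
    Some(parse_timing(slow)? / parse_timing(fast)?)
}
```

For example, `ratio("17.0 ms", "413 ns")` compares add with unchecked_add for PARAM_MESSAGE_2_CARRY_2 and comes out at roughly four orders of magnitude, while the add and keyswitch_programmable_bootstrap rows are nearly identical for every parameter set.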
Boolean
This measures the execution time of a single binary Boolean gate.
tfhe-rs::boolean.
Parameter set | Concrete FFT + AVX-512 |
---|---|
DEFAULT_PARAMETERS_KS_PBS | 9.19 ms |
PARAMETERS_ERROR_PROB_2_POW_MINUS_165_KS_PBS | 14.1 ms |
TFHE_LIB_PARAMETERS | 10.0 ms |
tfhe-lib.
Using the same m6i.metal machine as the one for tfhe-rs, the timings are:
Parameter set | spqlios-fma |
---|---|
default_128bit_gate_bootstrapping_parameters | 15.4 ms |
OpenFHE (v1.1.1).
Following the official instructions from OpenFHE, clang14 and the following command are used to set up the project:
cmake -DNATIVE_SIZE=32 -DWITH_NATIVEOPT=ON -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DWITH_OPENMP=OFF ..
To use the HEXL library, the configuration used is as follows:
Using the same m6i.metal machine as the one for tfhe-rs, the timings are:
Parameter set | GINX | GINX w/ Intel HEXL |
---|---|---|
FHEW_BINGATE/STD128_OR | 40.2 ms | 31.0 ms |
FHEW_BINGATE/STD128_LMKCDEY_OR | 38.6 ms | 28.4 ms |
How to reproduce TFHE-rs benchmarks
TFHE-rs benchmarks can be easily reproduced from source.
If the host machine does not support AVX512, then turning on AVX512_SUPPORT will not provide any speed-up.
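One way to check the host before enabling that option is to ask the CPU at runtime. The sketch below uses the standard library's `is_x86_feature_detected!` macro to test for the AVX-512 Foundation feature. The function name `host_supports_avx512f` is hypothetical, and checking only `avx512f` is an assumption: the exact set of AVX-512 extensions the benchmarks rely on is not stated here.

```rust
/// Hypothetical helper: true when the CPU running this program reports
/// AVX-512 Foundation support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn host_supports_avx512f() -> bool {
    // Runtime CPUID-based detection provided by the standard library.
    std::is_x86_feature_detected!("avx512f")
}

/// On non-x86 targets AVX-512 does not exist, so always report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn host_supports_avx512f() -> bool {
    false
}
```

If this returns `false`, the benchmarks are expected to run without the AVX-512 code paths, so timings will not match the Concrete FFT + AVX-512 columns above.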