Concrete
WebsiteLibrariesProducts & ServicesDevelopersSupport
2.10
2.10
  • Welcome
  • Get Started
    • What is Concrete?
    • Installation
    • Quick start
    • Quick overview
    • Terminology
  • Operations
    • Table Lookups basics
    • Non-linear operations
    • Other operations
      • Bit extraction
      • Common tips
      • Extensions
  • Compilation
    • Combining compiled functions
      • With composition
      • With modules
    • Key-related options for faster execution
      • Multi precision
      • Multi parameters
    • Compression
    • Reusing arguments
    • Parameter compatibility with restrictions
    • Common errors
  • Execution / Analysis
    • Simulation
    • Debugging and artifact
    • Performance
    • GPU acceleration
    • Other
      • Statistics
      • Progressbar
      • Formatting and drawing
  • Guides
    • Configure
    • Manage keys
    • Deploy
    • TFHE-rs Interoperability
      • Shared key
      • Serialization
    • Optimization
      • Improve parallelism
        • Dataflow parallelism
        • Tensorizing operations
      • Optimize table lookups
        • Reducing TLU
        • Implementation strategies
        • Round/truncating
        • Approximate mode
        • Bit extraction
      • Optimize cryptographic parameters
        • Error probability
        • Composition
  • Tutorials
    • See all tutorials
    • Part I: Concrete - FHE compiler
    • Part II: The Architecture of Concrete
  • References
    • API
    • Supported operations
  • Explanations
    • Compiler workflow
    • Advanced features
      • Table Lookups advanced
      • Rounding
      • Truncating
      • Floating points
      • Comparisons
      • Min/Max operations
      • Bitwise operations
      • Direct circuits
      • Tagging
    • Cryptography basics
    • Security
    • Frontend fusing
  • Developers
    • Contributing
      • Project layout
      • Compiler backend
        • Adding a new backend
      • Optimizer
      • MLIR FHE dialects
        • FHELinalg dialect
        • FHE dialect
        • TFHE dialect
        • Concrete dialect
        • Tracing dialect
        • Runtime dialect
        • SDFG dialect
      • Call FHE circuits from other languages
      • Benchmarking
      • Examples
      • Making a release
    • Release note
    • Feature request
    • Bug report
Powered by GitBook

Libraries

  • TFHE-rs
  • Concrete
  • Concrete ML
  • fhEVM

Developers

  • Blog
  • Documentation
  • Github
  • FHE resources

Company

  • About
  • Introduction to FHE
  • Media
  • Careers
On this page

Was this helpful?

Export as PDF
  1. Guides
  2. Optimization
  3. Optimize table lookups

Reducing TLU

This guide teaches how to improve the execution time of Concrete circuits by reducing the amount of table lookups.

Reducing the amount of table lookups is probably the most complicated guide in this section as it's not automated. The idea is to use mathematical properties of operations to reduce the amount of table lookups needed to achieve the result.

One great example is in adding big integers in bitmap representation. Here is the basic implementation:

def add_bitmaps(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >> 1
        result[i] = addition_and_carry % 2

    return result

There are two table lookups within the loop body, one for >> and one for %.

This implementation is not optimal though, since the same output can be achieved with just a single table lookup:

def add_bitmaps(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >> 1
        result[i] = addition_and_carry - (carry * 2)

    return result

It was possible to do this because the original operations had a mathematical equivalence with the optimized operations and optimized operations achieved the same output with less table lookups!

Here is the full code example and some numbers for this optimization:

import numpy as np
from concrete import fhe

N = 32

def add_bitmaps_naive(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >= 2
        result[i] = addition_and_carry % 2

    return result

def add_bitmaps_optimized(x, y):
    result = fhe.zeros((N,))
    carry = 0

    addition = x + y
    for i in range(N):
        addition_and_carry = addition[i] + carry
        carry = addition_and_carry >> 1
        result[i] = addition_and_carry - (carry * 2)

    return result

inputset = fhe.inputset(fhe.tensor[fhe.uint1, N], fhe.tensor[fhe.uint1, N])
for (name, implementation) in [("naive", add_bitmaps_naive), ("optimized", add_bitmaps_optimized)]:
    compiler = fhe.Compiler(implementation, {"x": "encrypted", "y": "encrypted"})
    circuit = compiler.compile(inputset)

    print(
        f"{name:>9} implementation "
        f"-> {int(circuit.programmable_bootstrap_count)} table lookups "
        f"-> {int(circuit.complexity):_} complexity"
    )

prints:

    naive implementation -> 63 table lookups -> 2_427_170_697 complexity
optimized implementation -> 32 table lookups -> 1_224_206_208 complexity

which is almost half the amount of table lookups and ~2x less complexity for the same operation!

PreviousOptimize table lookupsNextImplementation strategies

Last updated 1 month ago

Was this helpful?