Concrete
2.7
GPU acceleration


Last updated 10 months ago


This document explains how to use GPU acceleration with Concrete.

Concrete supports acceleration using one or more GPUs.

The GPU-enabled version is not available on pypi.org, which only hosts wheels with CPU support.

To use GPU acceleration, install the GPU/CUDA wheel from our Zama public PyPI repository using the following command:

pip install concrete-python --index-url https://pypi.zama.ai/gpu

After installing the GPU/CUDA wheel, you must configure the FHE program compilation to enable GPU offloading using the use_gpu option.
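As a minimal sketch of that configuration step, the function below compiles a toy circuit with use_gpu passed as a compilation option. The function name compile_increment_on_gpu, the increment circuit, and the input set are hypothetical and chosen only for illustration; the sketch assumes the GPU/CUDA wheel is installed and a compatible GPU is present.

```python
def compile_increment_on_gpu():
    """Compile a toy circuit with GPU offloading enabled (illustrative only)."""
    # Imported lazily so this module still loads on machines without Concrete.
    from concrete import fhe

    @fhe.compiler({"x": "encrypted"})
    def increment(x):
        return x + 1

    # Sample inputs used by the compiler to infer value ranges (assumed values).
    inputset = range(16)

    # use_gpu=True is the option named in the text; all other options keep
    # their defaults.
    return increment.compile(inputset, use_gpu=True)
```

Calling circuit = compile_increment_on_gpu() and then circuit.encrypt_run_decrypt(3) should return 4 on a machine where the GPU wheel and a supported GPU are available.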

Our GPU wheels are built with CUDA 11.8 and should be compatible with higher versions of CUDA.

GPU execution configuration

By default, the compiler and runtime use all available system resources, including all CPU cores and GPUs. You can adjust this behavior with the following environment variables:

SDFG_NUM_THREADS

  • Type: Integer

  • Default value: The number of hardware threads on the system (including hyperthreading) minus the number of GPUs in use.

  • Description: This variable determines the number of CPU threads that execute in parallel with the GPU for offloadable workloads. GPU scheduler threads (including CUDA threads and those used within Concrete) are necessary but can block or interfere with worker thread execution. Therefore, it is recommended to undersubscribe the CPU hardware threads by the number of GPU devices used.

  • Required: No
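The undersubscription rule above can be sketched as a small helper. The function name recommended_sdfg_num_threads is hypothetical, and the floor of one thread is an assumption added here so the result stays usable on machines with very few cores; it is not stated in the text.

```python
import os


def recommended_sdfg_num_threads(num_gpus, hardware_threads=None):
    """Return hardware threads minus GPUs in use, never less than 1."""
    if num_gpus < 0:
        raise ValueError("num_gpus must be non-negative")
    if hardware_threads is None:
        # os.cpu_count() counts logical CPUs, so hyperthreads are included.
        hardware_threads = os.cpu_count() or 1
    # Assumption: keep at least one worker thread even if GPUs outnumber cores.
    return max(1, hardware_threads - num_gpus)
```

For example, on a machine with 16 hardware threads and 2 GPUs this yields 14, which could then be exported as os.environ["SDFG_NUM_THREADS"] = "14", assuming the variable is set before the Concrete runtime starts.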

SDFG_NUM_GPUS

  • Type: Integer

  • Default value: The number of GPUs available.

  • Description: This value determines the number of GPUs to use for offloading. This can be set to any value between 1 and the total number of GPUs on the system.

  • Required: No
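Because the accepted range depends on the machine, it may help to validate a requested GPU count before exporting it. The sketch below is one way to do that; sdfg_num_gpus_value is a hypothetical helper, and the caller is assumed to know how many GPUs the system has.

```python
def sdfg_num_gpus_value(requested, available):
    """Return the string to export as SDFG_NUM_GPUS, or raise if out of range."""
    if available < 1:
        raise ValueError("no GPU available on this system")
    # The text allows any value between 1 and the total number of GPUs.
    if not 1 <= requested <= available:
        raise ValueError(
            f"SDFG_NUM_GPUS must be between 1 and {available}, got {requested}"
        )
    # Environment variables are strings, so convert before exporting.
    return str(requested)
```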

SDFG_MAX_BATCH_SIZE

  • Type: Integer

  • Default value: LLONG_MAX (no batch size limit)

  • Description: This value limits the maximum batch size for offloading in cases where the GPU memory is insufficient.

  • Required: No

SDFG_DEVICE_TO_CORE_RATIO

  • Type: Integer

  • Default value: The ratio between the compute capability of the GPU (at index 0) and a CPU core.

  • Description: This ratio is used to balance the load between the CPU and GPU. If the GPU is underutilized, set this value higher to increase the amount of work offloaded to the GPU.

  • Required: No

OMP_NUM_THREADS

  • Type: Integer

  • Default value: The number of hardware threads on the system, including hyperthreading.

  • Description: This value specifies the number of CPU threads used for the portions of program execution that are not yet supported for GPU offload, which are parallelized using OpenMP on the CPU.

  • Required: No
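One way to apply these variables without changing the parent shell is to launch the FHE program in a child process with an adjusted environment. The sketch below uses only the Python standard library; run_with_gpu_settings is a hypothetical helper, the numeric values are illustrative rather than recommended, and the probe command merely echoes one variable in place of a real Concrete script.

```python
import os
import subprocess
import sys


def run_with_gpu_settings(command, settings):
    """Run command in a child process with the given variables set."""
    env = os.environ.copy()
    # Environment values must be strings.
    env.update({name: str(value) for name, value in settings.items()})
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Illustrative values only; tune them for the actual machine and workload.
settings = {
    "SDFG_NUM_GPUS": 1,
    "SDFG_NUM_THREADS": 6,
    "SDFG_MAX_BATCH_SIZE": 4096,
    "OMP_NUM_THREADS": 8,
}

# Stand-in for a real script: the child prints what it received.
probe = [sys.executable, "-c", "import os; print(os.environ['SDFG_NUM_GPUS'])"]
result = run_with_gpu_settings(probe, settings)
```

Setting the variables in the child's environment means they are in place before the child imports anything, which sidesteps any question of when the runtime reads them.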
