Comment on page
Built-in Model Examples
In Concrete ML, built-in linear models are exact equivalents to their scikit-learn counterparts. As they do not apply any non-linearity during inference, these models are very fast (~1ms FHE inference time) and can use high-precision integers (between 20-25 bits).
Tree-based models apply non-linear functions that enable comparisons of inputs and trained thresholds. Thus, they are limited with respect to the number of bits used to represent the inputs. But as these examples show, in practice 5-6 bits are sufficient to exactly reproduce the behavior of their scikit-learn counterpart models.
In the examples below, built-in neural networks can be configured to work with user-specified accumulator sizes, which allow the user to adjust the speed/accuracy trade-off.
These examples show how to use the built-in linear models on synthetic data, which allows for easy visualization of the decision boundaries or trend lines. Executing these 1D and 2D models in FHE takes around 1 millisecond.
Using the OpenML spams data-set, this example shows how to train a classifier that detects spam, based on features extracted from email messages. A grid-search is performed over decision-tree hyper-parameters to find the best ones.
This example shows how to train tree-ensemble models (either XGBoost or Random Forest), first on a synthetic data-set, and then on the Diabetes data-set. Grid-search is used to find the best number of trees in the ensemble.
Privacy-preserving prediction of house prices is shown in this example, using the House Prices data-set. Using 50 trees in the ensemble, with 5 bits of precision for the input features, the FHE regressor obtains an
score of 0.90 and an execution time of 7-8 seconds.
Based on three different synthetic data-sets, all the built-in classifiers are demonstrated in this notebook, showing accuracies, inference times, accumulator bit-widths, and decision boundaries.