Nearest neighbors

This document introduces the nearest neighbors non-parametric classification models that Concrete ML provides with a scikit-learn interface through the KNeighborsClassifier class.

Example

from concrete.ml.sklearn import KNeighborsClassifier

concrete_classifier = KNeighborsClassifier(n_bits=2, n_neighbors=3)

Quantization parameters

The KNeighborsClassifier class quantizes the training data-set provided to .fit using the specified number of bits (n_bits). To comply with accumulator size constraints, you must keep this value low. The model's accuracy will depend significantly on a well-chosen n_bits value and the dimensionality of the data.

The predict method of the KNeighborsClassifier performs the following steps:

  1. Quantize the test vectors on clear data

  2. Compute the top-k class indices of the closest training set vector on encrypted data

  3. Vote for the top-k class labels to find the class for each test vector, performed on clear data

Inference time considerations

The FHE inference latency of this model is heavily influenced by the n_bits and the dimensionality of the data. Additionally, the data-set size has a linear impact on the data complexity. The number of nearest neighbors (n_neighbors) also affects performance.

Last updated