Concrete-ML partially supports Pandas: most available models (linear and tree-based) can be used with Pandas dataframes just as they would be with NumPy arrays.
The table below summarizes current compatibility:
| Methods | Support Pandas dataframe |
| --- | --- |
| fit | ✓ |
| compile | ✗ |
| predict (execute_in_fhe=False) | ✓ |
| predict (execute_in_fhe=True) | ✓ |
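Since compile is the one method listed above that does not accept a Pandas dataframe, the calibration data has to be converted to a NumPy array before compiling. The sketch below shows one possible way to do that with a small helper. The name `to_numpy_if_dataframe` is hypothetical and not part of Concrete-ML; it only assumes that Pandas objects expose a `to_numpy()` method while NumPy arrays do not.

```python
def to_numpy_if_dataframe(data):
    """Return `data.to_numpy()` if the object provides it, else `data` unchanged.

    Hypothetical helper, not part of Concrete-ML. Pandas DataFrame and Series
    objects expose `to_numpy()`; NumPy arrays do not, so they pass through.
    """
    convert = getattr(data, "to_numpy", None)
    return convert() if callable(convert) else data
```

With such a helper, a call like `model.compile(to_numpy_if_dataframe(X_train))` would work whether `X_train` is a dataframe or already a NumPy array. Calling `X_train.to_numpy()` directly is equally valid when the input type is known.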
Example
The following example fits a LogisticRegression model on a simple classification problem. A more advanced example can be found in the Titanic use case notebook, which uses an XGBClassifier.
    import numpy as np
    import pandas as pd
    from concrete.ml.sklearn import LogisticRegression
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Create the data set as a Pandas dataframe
    X, y = make_classification(
        n_samples=250,
        n_features=30,
        n_redundant=0,
        random_state=2,
    )
    X, y = pd.DataFrame(X), pd.DataFrame(y)

    # Retrieve train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

    # Instantiate the model
    model = LogisticRegression(n_bits=8)

    # Fit the model
    model.fit(X_train, y_train)

    # Evaluate the model on the test set in clear
    y_pred_clear = model.predict(X_test)

    # Compile the model
    model.compile(X_train.to_numpy())

    # Perform the inference in FHE
    y_pred_fhe = model.predict(X_test, execute_in_fhe=True)

    # Assert that FHE predictions are the same as the clear predictions
    print(
        f"{(y_pred_fhe == y_pred_clear).sum()} "
        f"examples over {len(y_pred_fhe)} have a FHE inference equal to the clear inference."
    )

    # Output:
    # 100 examples over 100 have a FHE inference equal to the clear inference.