concrete.ml.quantization.base_quantized_op
Base Quantized Op class that implements quantization for a float numpy op.
ONNX_OPS_TO_NUMPY_IMPL
ALL_QUANTIZED_OPS
ONNX_OPS_TO_QUANTIZED_IMPL
DEFAULT_MODEL_BITS
QuantizedOp
Base class for quantized ONNX ops implemented in numpy.
Args:
n_bits_output (int): The number of bits to use for the quantization of the output.
int_input_names (Set[str]): The set of names of integer tensors that are inputs to this op.
constant_inputs (Optional[Union[Dict[str, Any], Dict[int, Any]]]): The constant tensors that are inputs to this op.
input_quant_opts (QuantizationOptions): Input quantizer options. They determine the quantization applied to input tensors that are not constants.
__init__
calibrate
Create corresponding QuantizedArray for the output of the activation function.
Args:
*inputs (numpy.ndarray): Calibration sample inputs.
Returns:
numpy.ndarray: The output values for the provided calibration samples.
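The calibration step can be pictured with a minimal sketch, assuming plain asymmetric uniform quantization: the float outputs observed on the calibration samples fix a scale and a zero point for `n_bits_output` bits. The names `UniformQuantParams` and `calibrate_output` are hypothetical and are not part of Concrete ML; the library's actual formulas may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UniformQuantParams:
    """Hypothetical container for calibrated output parameters."""

    scale: float
    zero_point: int
    n_bits: int


def calibrate_output(samples: Sequence[float], n_bits: int) -> UniformQuantParams:
    """Derive asymmetric uniform quantization parameters from float samples."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        # A constant output has no range to spread over the integer levels.
        raise ValueError("calibration samples must span a non-empty range")
    levels = 2 ** n_bits - 1          # number of steps between the extreme integers
    scale = (hi - lo) / levels        # float width of one integer step
    zero_point = round(-lo / scale)   # integer that represents the float value 0.0
    return UniformQuantParams(scale, zero_point, n_bits)


# Float outputs of the op on the calibration set (illustrative values).
params = calibrate_output([-1.0, 0.5, 2.0], n_bits=2)
```

With these samples the range [-1.0, 2.0] is split into three steps of width 1.0, and the float value 0.0 maps to the integer 1.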
call_impl
Call self.impl to centralize mypy bug workaround.
Args:
*inputs (numpy.ndarray): Real-valued inputs.
**attrs: The QuantizedOp attributes.
Returns:
numpy.ndarray: Return value of self.impl.
can_fuse
Determine if the operator impedes graph fusion.
Inheriting classes should override this function and test self._int_input_names to determine whether the operation can be fused to a TLU. For example, an operation whose inputs are all produced by a single integer tensor can be fused to a TLU.
Example: f(x) = x * (x + 1) can be fused. A function that computes f(x) = x * (x @ w + 1) can't be fused.
Returns:
bool: Whether this instance of the QuantizedOp produces Concrete Numpy code that can be fused to TLUs.
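The two examples above can be illustrated with a small sketch in plain Python. It assumes that a TLU behaves like a lookup table indexed by one integer value; the helper names are hypothetical, and this is not how Concrete ML performs the check (which, per the description, inspects self._int_input_names).

```python
def build_lookup_table(func, n_bits):
    """Tabulate func over every unsigned n_bits integer."""
    return [func(x) for x in range(2 ** n_bits)]


def fusable(x):
    # Every term depends on the same single integer x.
    return x * (x + 1)


def not_fusable(xs, w):
    # The dot product mixes all entries of xs, so output i depends on more than xs[i].
    dot = sum(a * b for a, b in zip(xs, w))
    return [x * (dot + 1) for x in xs]


# f(x) = x * (x + 1): one table covers every possible 3-bit input.
table = build_lookup_table(fusable, n_bits=3)

# f(x) = x * (x @ w + 1): the same first entry gives different first outputs,
# so no per-element table indexed by xs[0] alone can reproduce the function.
out_a = not_fusable([1, 0], w=[1, 2])
out_b = not_fusable([1, 1], w=[1, 2])
```

Here `table[x]` equals `fusable(x)` for every 3-bit `x`, while `out_a[0]` and `out_b[0]` differ even though both inputs start with 1.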
must_quantize_input
Determine if an input must be quantized.
Quantized ops and numpy onnx ops take inputs and attributes. Inputs can be either constant or variable (encrypted). Note that this does not handle attributes, which are handled by QuantizedOp classes separately in their constructor.
Args:
input_name_or_idx (int): Index of the input to check.
Returns:
result (bool): Whether the input must be quantized (must be a QuantizedArray) or if it stays as a raw numpy.array read from ONNX.
prepare_output
Quantize the output of the activation function.
The calibrate method needs to be called with sample data before using this function.
Args:
qoutput_activation (numpy.ndarray): Output of the activation function.
Returns:
QuantizedArray: Quantized output.
q_impl
Execute the quantized forward.
Args:
*q_inputs (QuantizedArray): Quantized inputs.
**attrs: The QuantizedOp attributes.
Returns:
QuantizedArray: The returned quantized value.
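One plausible reading of a quantized forward pass for a float op is sketched below: dequantize the inputs, apply the float implementation, then quantize the result with output parameters obtained beforehand from calibration. This is an assumption for illustration only. `ToyQuantizedArray`, `quantize`, and `toy_q_impl` are hypothetical names, lists stand in for numpy arrays, and the real QuantizedOp may proceed differently.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyQuantizedArray:
    """Hypothetical stand-in for QuantizedArray: integers plus scale and zero point."""

    qvalues: List[int]
    scale: float
    zero_point: int

    def dequantize(self) -> List[float]:
        return [(q - self.zero_point) * self.scale for q in self.qvalues]


def quantize(values: List[float], scale: float, zero_point: int, n_bits: int) -> ToyQuantizedArray:
    """Round floats to integers and clip them to the unsigned n_bits range."""
    qmax = 2 ** n_bits - 1
    qvalues = [min(max(round(v / scale) + zero_point, 0), qmax) for v in values]
    return ToyQuantizedArray(qvalues, scale, zero_point)


def toy_q_impl(
    q_input: ToyQuantizedArray,
    float_impl: Callable[[float], float],
    out_scale: float,
    out_zero_point: int,
    n_bits_output: int,
) -> ToyQuantizedArray:
    """Dequantize, apply the float op element-wise, then requantize the output."""
    real_inputs = q_input.dequantize()
    real_outputs = [float_impl(v) for v in real_inputs]
    # The output parameters are assumed to come from a prior calibration step.
    return quantize(real_outputs, out_scale, out_zero_point, n_bits_output)


# A ReLU applied to the quantized representation of [-1.0, 0.0, 1.0].
q_in = ToyQuantizedArray(qvalues=[0, 2, 4], scale=0.5, zero_point=2)
q_out = toy_q_impl(q_in, lambda v: max(v, 0.0), out_scale=0.25, out_zero_point=0, n_bits_output=3)
```

In this toy run the output integers are `[0, 0, 4]`, which dequantize back to `[0.0, 0.0, 1.0]`.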