DML_QUANTIZED_LINEAR_CONVOLUTION_OPERATOR_DESC structure (directml.h)

08/21/2024

Performs a convolution of the FilterTensor with the InputTensor. This operator performs forward convolution on quantized data. This operator is mathematically equivalent to dequantizing the inputs, convolving, and then quantizing the output.

This operator uses the following linear quantization functions.

Dequantize function

f(Input, Scale, ZeroPoint) = (Input - ZeroPoint) * Scale

Quantize function

f(Input, Scale, ZeroPoint) = clamp(round(Input / Scale) + ZeroPoint, Min, Max)
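
The two functions above can be sketched in C++ for UINT8 data as follows. This is only an illustrative reference, not DirectML's implementation: the helper names `DequantizeLinear` and `QuantizeLinear` are hypothetical, Min and Max are assumed to be the UINT8 range 0 to 255, and the rounding mode (round half to even via `std::nearbyint`) is an assumption because it is not specified here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Dequantize: f(Input, Scale, ZeroPoint) = (Input - ZeroPoint) * Scale
// (hypothetical helper, UINT8 variant)
inline float DequantizeLinear(std::uint8_t input, float scale, std::uint8_t zeroPoint)
{
    const int centered = static_cast<int>(input) - static_cast<int>(zeroPoint);
    return static_cast<float>(centered) * scale;
}

// Quantize: f(Input, Scale, ZeroPoint) = clamp(round(Input / Scale) + ZeroPoint, Min, Max)
// Assumption: Min = 0 and Max = 255 (UINT8), rounding is half-to-even in the
// default floating-point rounding mode. A scale of 0 is not handled.
inline std::uint8_t QuantizeLinear(float input, float scale, std::uint8_t zeroPoint)
{
    const float rounded = std::nearbyint(input / scale);
    const float shifted = rounded + static_cast<float>(zeroPoint);
    const float clamped = std::min(std::max(shifted, 0.0f), 255.0f);
    return static_cast<std::uint8_t>(clamped);
}
```

For example, with a scale of 0.5 and a zero point of 128, the quantized value 130 dequantizes to 1.0, and 1.0 quantizes back to 130. Values that fall outside the representable range saturate at 0 or 255.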

Syntax

struct DML_QUANTIZED_LINEAR_CONVOLUTION_OPERATOR_DESC {
  const DML_TENSOR_DESC *InputTensor;
  const DML_TENSOR_DESC *InputScaleTensor;
  const DML_TENSOR_DESC *InputZeroPointTensor;
  const DML_TENSOR_DESC *FilterTensor;
  const DML_TENSOR_DESC *FilterScaleTensor;
  const DML_TENSOR_DESC *FilterZeroPointTensor;
  const DML_TENSOR_DESC *BiasTensor;
  const DML_TENSOR_DESC *OutputScaleTensor;
  const DML_TENSOR_DESC *OutputZeroPointTensor;
  const DML_TENSOR_DESC *OutputTensor;
  UINT                  DimensionCount;
  const UINT            *Strides;
  const UINT            *Dilations;
  const UINT            *StartPadding;
  const UINT            *EndPadding;
  UINT                  GroupCount;
};

Members

InputTensor

Type: const DML_TENSOR_DESC*

A tensor containing the input data. The expected dimensions of the InputTensor are { InputBatchCount, InputChannelCount, InputHeight, InputWidth }.

InputScaleTensor

Type: const DML_TENSOR_DESC*

A tensor containing the input scale data. The expected dimensions of the InputScaleTensor are { 1, 1, 1, 1 }. This scale value is used for dequantizing the input values.

Note

A scale value of 0 results in undefined behavior.

InputZeroPointTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An optional tensor containing the input zero point data. The expected dimensions of the InputZeroPointTensor are { 1, 1, 1, 1 }. This zero point value is used for dequantizing the input values.

FilterTensor

Type: const DML_TENSOR_DESC*

A tensor containing the filter data. The expected dimensions of the FilterTensor are { FilterBatchCount, FilterChannelCount, FilterHeight, FilterWidth }.

FilterScaleTensor

Type: const DML_TENSOR_DESC*

A tensor containing the filter scale data. The expected dimensions of the FilterScaleTensor are { 1, 1, 1, 1 } if per-tensor quantization is required, or { 1, OutputChannelCount, 1, 1 } if per-channel quantization is required. This scale value is used for dequantizing the filter values.

Note

A scale value of 0 results in undefined behavior.

FilterZeroPointTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An optional tensor containing the filter zero point data. The expected dimensions of the FilterZeroPointTensor are { 1, 1, 1, 1 } if per-tensor quantization is required, or { 1, OutputChannelCount, 1, 1 } if per-channel quantization is required. This zero point value is used for dequantizing the filter values.
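
The difference between per-tensor and per-channel filter quantization can be sketched as a lookup keyed by output channel. The helper names below are hypothetical, the scale and zero point tensors are modeled as flat vectors, and treating an omitted zero point tensor as 0 is an assumption rather than something stated here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: one scale means per-tensor ({ 1, 1, 1, 1 });
// otherwise there is one scale per output channel ({ 1, OutputChannelCount, 1, 1 }).
inline float FilterScaleFor(const std::vector<float>& scales, std::size_t outputChannel)
{
    return scales.size() == 1 ? scales[0] : scales[outputChannel];
}

// Hypothetical helper: an empty vector models an omitted FilterZeroPointTensor,
// which is assumed here to behave like a zero point of 0.
inline int FilterZeroPointFor(const std::vector<std::uint8_t>& zeroPoints, std::size_t outputChannel)
{
    if (zeroPoints.empty()) return 0;
    return zeroPoints.size() == 1 ? zeroPoints[0] : zeroPoints[outputChannel];
}

// Dequantizes one UINT8 filter element that belongs to the given output channel.
inline float DequantizeFilterValue(std::uint8_t value,
                                   const std::vector<float>& scales,
                                   const std::vector<std::uint8_t>& zeroPoints,
                                   std::size_t outputChannel)
{
    const int centered = static_cast<int>(value) - FilterZeroPointFor(zeroPoints, outputChannel);
    return static_cast<float>(centered) * FilterScaleFor(scales, outputChannel);
}
```

With per-channel parameters, every filter element that contributes to a given output channel shares that channel's scale and zero point.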

BiasTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An optional tensor containing the bias data. The bias values are broadcast across the output tensor and added to the result at the end of the convolution. The expected dimensions of the BiasTensor are { 1, OutputChannelCount, 1, 1 } for 4D.

OutputScaleTensor

Type: const DML_TENSOR_DESC*

A tensor containing the output scale data. The expected dimensions of the OutputScaleTensor are { 1, 1, 1, 1 }. This scale value is used for quantizing the convolution output values.

Note

A scale value of 0 results in undefined behavior.

OutputZeroPointTensor

Type: _Maybenull_ const DML_TENSOR_DESC*

An optional tensor containing the output zero point data. The expected dimensions of the OutputZeroPointTensor are { 1, 1, 1, 1 }. This zero point value is used for quantizing the convolution output values.

OutputTensor

Type: const DML_TENSOR_DESC*

A tensor to write the results to. The expected dimensions of the OutputTensor are { OutputBatchCount, OutputChannelCount, OutputHeight, OutputWidth }.

DimensionCount

Type: UINT

The number of spatial dimensions for the convolution operation. Spatial dimensions are the lower dimensions of the convolution filter tensor FilterTensor. This value also determines the size of the Strides, Dilations, StartPadding, and EndPadding arrays. Only a value of 2 is supported.

Strides

Type: _Field_size_(DimensionCount) const UINT*

The strides of the convolution operation. These strides are applied to the convolution filter. They are separate from the tensor strides included in DML_TENSOR_DESC.

Dilations

Type: _Field_size_(DimensionCount) const UINT*

The dilations of the convolution operation. Dilations are strides applied to the elements of the filter kernel. This has the effect of simulating a larger filter kernel by padding the internal filter kernel elements with zeros.

StartPadding

Type: _Field_size_(DimensionCount) const UINT*

The padding values to be applied to the beginning of each spatial dimension of the filter and input tensor of the convolution operation.

EndPadding

Type: _Field_size_(DimensionCount) const UINT*

The padding values to be applied to the end of each spatial dimension of the filter and input tensor of the convolution operation.
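
As a rough illustration of how Strides, Dilations, StartPadding, and EndPadding interact, the following sketch computes the output extent of one spatial dimension using the standard forward-convolution size formula. The function name `ConvOutputSize` is hypothetical, and the sketch assumes a filter size and stride of at least 1 and a padded input that is at least as large as the dilated filter.

```cpp
#include <cstdint>

// Hypothetical helper: output extent of one spatial dimension.
// A dilation of d spreads a filter of size k over (k - 1) * d + 1 input elements.
inline std::uint32_t ConvOutputSize(std::uint32_t inputSize,
                                    std::uint32_t filterSize,
                                    std::uint32_t stride,
                                    std::uint32_t dilation,
                                    std::uint32_t startPadding,
                                    std::uint32_t endPadding)
{
    const std::uint32_t dilatedFilter = (filterSize - 1) * dilation + 1;
    const std::uint32_t paddedInput = inputSize + startPadding + endPadding;
    // Assumes paddedInput >= dilatedFilter; otherwise the subtraction would wrap.
    return (paddedInput - dilatedFilter) / stride + 1;
}
```

For instance, a 224-wide input with a 7-wide filter, a stride of 2, a dilation of 1, and padding of 3 at both ends gives an output width of 112.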

GroupCount

Type: UINT

The number of groups into which to divide the convolution operation. GroupCount can be used to achieve depth-wise convolution by setting GroupCount equal to the input channel count. This divides the convolution into a separate convolution per input channel.
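
To make the effect of GroupCount on the filter shape concrete, the sketch below derives the FilterTensor sizes for a grouped convolution. It assumes the usual grouped-convolution convention, in which FilterBatchCount equals the output channel count and FilterChannelCount equals the input channel count divided by GroupCount. That convention and the helper name `GroupedFilterSizes` are assumptions for illustration, not statements from this page.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: returns { FilterBatchCount, FilterChannelCount, FilterHeight, FilterWidth }.
// Assumes inputChannelCount is evenly divisible by groupCount.
inline std::array<std::uint32_t, 4> GroupedFilterSizes(std::uint32_t inputChannelCount,
                                                       std::uint32_t outputChannelCount,
                                                       std::uint32_t groupCount,
                                                       std::uint32_t filterHeight,
                                                       std::uint32_t filterWidth)
{
    return { outputChannelCount, inputChannelCount / groupCount, filterHeight, filterWidth };
}
```

Under that assumption, a depth-wise 3x3 convolution over 32 channels (GroupCount of 32) uses a filter of { 32, 1, 3, 3 }, while the same layer with a GroupCount of 1 would use { 32, 32, 3, 3 }.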

Availability

This operator was introduced in DML_FEATURE_LEVEL_2_1.

Tensor constraints

BiasTensor, FilterTensor, InputTensor, and OutputTensor must have the same DimensionCount.
OutputTensor and OutputZeroPointTensor must have the same DataType.
InputTensor and InputZeroPointTensor must have the same DataType.
FilterTensor and FilterZeroPointTensor must have the same DataType.
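
The constraints above can be checked before creating the operator. The following sketch uses hypothetical stand-in types (`DataType`, `TensorInfo`, `QuantizedConvTensors`) rather than the real DML_TENSOR_DESC structures, and it only covers the four constraints listed in this section. Optional tensors are modeled as null pointers.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the data type and dimension count of a tensor description.
enum class DataType { Int8, Uint8, Int32, Float32 };

struct TensorInfo
{
    DataType dataType;
    std::uint32_t dimensionCount;
};

struct QuantizedConvTensors
{
    const TensorInfo* input;
    const TensorInfo* inputZeroPoint;   // optional, may be null
    const TensorInfo* filter;
    const TensorInfo* filterZeroPoint;  // optional, may be null
    const TensorInfo* bias;             // optional, may be null
    const TensorInfo* outputZeroPoint;  // optional, may be null
    const TensorInfo* output;
};

// Returns true when the tensors satisfy the constraints listed above.
inline bool SatisfiesTensorConstraints(const QuantizedConvTensors& t)
{
    if (!t.input || !t.filter || !t.output) return false;

    // BiasTensor, FilterTensor, InputTensor, and OutputTensor share a DimensionCount.
    const std::uint32_t dims = t.input->dimensionCount;
    if (t.filter->dimensionCount != dims || t.output->dimensionCount != dims) return false;
    if (t.bias && t.bias->dimensionCount != dims) return false;

    // Each zero point tensor, when present, matches the DataType of the tensor it applies to.
    if (t.inputZeroPoint && t.inputZeroPoint->dataType != t.input->dataType) return false;
    if (t.filterZeroPoint && t.filterZeroPoint->dataType != t.filter->dataType) return false;
    if (t.outputZeroPoint && t.outputZeroPoint->dataType != t.output->dataType) return false;

    return true;
}
```

A check like this catches mismatches such as a UINT8 InputTensor paired with an INT8 InputZeroPointTensor before the description is handed to the API.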

Tensor support

DML_FEATURE_LEVEL_5_2 and above

Tensor	Kind	Supported dimension counts	Supported data types
InputTensor	Input	3 to 4	INT8, UINT8
InputScaleTensor	Input	1 to 4	FLOAT32
InputZeroPointTensor	Optional input	1 to 4	INT8, UINT8
FilterTensor	Input	3 to 4	INT8, UINT8
FilterScaleTensor	Input	1 to 4	FLOAT32
FilterZeroPointTensor	Optional input	1 to 4	INT8, UINT8
BiasTensor	Optional input	3 to 4	INT32
OutputScaleTensor	Input	1 to 4	FLOAT32
OutputZeroPointTensor	Optional input	1 to 4	INT8, UINT8
OutputTensor	Output	3 to 4	INT8, UINT8

DML_FEATURE_LEVEL_4_0 and above

Tensor	Kind	Supported dimension counts	Supported data types
InputTensor	Input	3 to 4	INT8, UINT8
InputScaleTensor	Input	1 to 4	FLOAT32
InputZeroPointTensor	Optional input	1 to 4	INT8, UINT8
FilterTensor	Input	3 to 4	INT8, UINT8
FilterScaleTensor	Input	3 to 4	FLOAT32
FilterZeroPointTensor	Optional input	1 to 4	INT8, UINT8
BiasTensor	Optional input	3 to 4	INT32
OutputScaleTensor	Input	1 to 4	FLOAT32
OutputZeroPointTensor	Optional input	1 to 4	INT8, UINT8
OutputTensor	Output	3 to 4	INT8, UINT8

DML_FEATURE_LEVEL_2_1 and above

Tensor	Kind	Supported dimension counts	Supported data types
InputTensor	Input	4	INT8, UINT8
InputScaleTensor	Input	4	FLOAT32
InputZeroPointTensor	Optional input	4	INT8, UINT8
FilterTensor	Input	4	INT8, UINT8
FilterScaleTensor	Input	4	FLOAT32
FilterZeroPointTensor	Optional input	4	INT8, UINT8
BiasTensor	Optional input	4	INT32
OutputScaleTensor	Input	4	FLOAT32
OutputZeroPointTensor	Optional input	4	INT8, UINT8
OutputTensor	Output	4	INT8, UINT8

Requirements

Requirement	Value
Minimum supported client	Windows 10 Build 20348
Minimum supported server	Windows 10 Build 20348
Header	directml.h
