DML_ROI_POOLING_OPERATOR_DESC structure (directml.h)
Performs a MaxPool function across the input tensor (according to regions of interest, or ROIs). For each output element, the coordinates of its corresponding ROI in the input are computed by the following equations.
Let Y
be an index into the third dimension of InputTensor ({ BatchSize, ChannelCount, **height**, width }
).
Let X
be an index into the fourth dimension of InputTensor ({ BatchSize, ChannelCount, height, **width** }
).
x1 = round(RoiX1 * SpatialScale)
x2 = round(RoiX2 * SpatialScale)
y1 = round(RoiY1 * SpatialScale)
y2 = round(RoiY2 * SpatialScale)
RegionHeight = y2 - y1 + 1
RegionWidth = x2 - x1 + 1
StartY = (OutputIndices.Y * RegionHeight) / PooledSize.Height + y1
StartX = (OutputIndices.X * RegionWidth) / PooledSize.Width + x1
EndY = ((OutputIndices.Y + 1) * RegionHeight + PooledSize.Height - 1) / PooledSize.Height + y1
EndX = ((OutputIndices.X + 1) * RegionWidth + PooledSize.Width - 1) / PooledSize.Width + x1
If the computed coordinates are out of bounds, then they are clamped to the input boundaries.
Syntax
struct DML_ROI_POOLING_OPERATOR_DESC {
const DML_TENSOR_DESC *InputTensor;
const DML_TENSOR_DESC *ROITensor;
const DML_TENSOR_DESC *OutputTensor;
FLOAT SpatialScale;
DML_SIZE_2D PooledSize;
};
Members
InputTensor
Type: const DML_TENSOR_DESC*
A tensor containing the input data with dimensions { BatchCount, ChannelCount, InputHeight, InputWidth }
.
ROITensor
Type: const DML_TENSOR_DESC*
A tensor containing the regions of interest (ROI) data. The expected dimensions of ROITensor
are { 1, 1, NumROIs, 5 }
and the data for each ROI is [BatchID, x1, y1, x2, y2]
. x1, y1, x2, y2 are the inclusive coordinates of the corners of each ROI and x2 >= x1, y2 >= y1.
OutputTensor
Type: const DML_TENSOR_DESC*
A tensor containing the output data. The expected dimensions of OutputTensor are { NumROIs, InputChannelCount, PooledSize.Height, PooledSize.Width }
.
SpatialScale
Type: FLOAT
Multiplicative spatial scale factor used to translate the ROI coordinates from their input scale to the scale used when pooling.
PooledSize
Type: DML_SIZE_2D
The ROI pool output size (height, width), which must match the last 2 dimensions of OutputTensor.
Availability
This operator was introduced in DML_FEATURE_LEVEL_1_0
.
Tensor constraints
InputTensor, OutputTensor, and ROITensor must have the same DataType.
Tensor support
Tensor | Kind | Supported dimension counts | Supported data types |
---|---|---|---|
InputTensor | Input | 4 | FLOAT32, FLOAT16 |
ROITensor | Input | 4 | FLOAT32, FLOAT16 |
OutputTensor | Output | 4 | FLOAT32, FLOAT16 |
Requirements
Requirement | Value |
---|---|
Header | directml.h |