1
NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX
2
Tensor Type | Float32 | ANEURALNETWORKS_TENSOR_FLOAT32 | MPSImageFeatureChannelFormatFloat32 (MPS uses float 16 internally) | DML_TENSOR_DATA_TYPE_FLOAT32 | BNNSDataTypeFloat32 | memory::data_type:f32 | cldnn_f32 | float32 (Tensor Element Types)
3
Float16 | ANEURALNETWORKS_TENSOR_FLOAT16 | MPSImageFeatureChannelFormatFloat16 | DML_TENSOR_DATA_TYPE_FLOAT16 | BNNSDataTypeFloat16 | memory::data_type:f16 | cldnn_f16 | float16 (Tensor Element Types)
4
Quantized Int8 | ANEURALNETWORKS_TENSOR_QUANT8_ASYMM
real_value = (integer_value - zeroPoint) * scale
[Note]: not supported | DML_TENSOR_DATA_TYPE_UINT8 | [Note]: not supported | memory::data_type:s8 | [Note]: not supported | [Note]: not mentioned
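The quantization formula in the row above can be sketched in a few lines of Python. This is a minimal illustration, not code from any of the listed APIs: the function names are hypothetical, and the 0..255 clamping range in `quantize` is an assumption based on the asymmetric 8-bit type being unsigned.

```python
def dequantize(integer_values, scale, zero_point):
    """Apply real_value = (integer_value - zeroPoint) * scale element-wise."""
    return [(q - zero_point) * scale for q in integer_values]


def quantize(real_values, scale, zero_point, qmin=0, qmax=255):
    """Inverse mapping: round to the nearest step, then clamp (range is assumed)."""
    result = []
    for r in real_values:
        q = int(round(r / scale)) + zero_point
        result.append(max(qmin, min(qmax, q)))
    return result


if __name__ == "__main__":
    reals = dequantize([0, 128, 255], scale=0.5, zero_point=128)
    print(reals)
    print(quantize(reals, scale=0.5, zero_point=128))
```

A backend without a quantized tensor type would have to run such a conversion before and after its float kernels.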
5
6
Convolution | op | ANEURALNETWORKS_CONV_2D | MPSCNNConvolutionNode +
MPSCNNConvolutionDescriptor
DML_OPERATOR_CONVOLUTION +
DML_CONVOLUTION_OPERATOR_DESC
BNNSFilterCreateConvolutionLayer | convolution_forward | cldnn_convolution_desc | Conv
7
input | 4-D tensor (NHWC)
input[0]: [batches, height, width, depth_in]
4-D tensor (NHWC)
MPSImageDescriptor.{numberOfImages, featureChannels, width, height, MPSImageFeatureChannelFormatFloat16}
sourceImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_CONVOLUTION_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
4-D tensor (NCHW)
BNNSImageStackDescriptor: pixel P(c,x,y) at position (x,y) in channel c is stored in data[x + row_stride * y + image_stride * c]
row_stride = width
image_stride = row_stride * height
4-D tensor (NCHW)
src_desc of convolution_forward::desc
4-D tensor (NHWC or NCHW)
cldnn_convolution_desc.input
4-D tensor (NCHW)
Inputs["X"] : [batches, channels, height, width]
[Note]: also supports tensors with more than 4 dimensions, shaped [N, C, D1, D2, ..., Dn]
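Since the input row mixes NHWC backends with NCHW backends, a converter has to reorder data between the two. The sketch below shows the flat-offset arithmetic for both layouts and the BNNS image-stack formula quoted above; all function names are hypothetical and the tensors are plain Python lists for illustration only.

```python
def nhwc_offset(n, h, w, c, H, W, C):
    """Flat offset of element (n, h, w, c) in an NHWC buffer."""
    return ((n * H + h) * W + w) * C + c


def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in an NCHW buffer."""
    return ((n * C + c) * H + h) * W + w


def bnns_offset(c, x, y, width, height):
    """BNNS image stack: data[x + row_stride * y + image_stride * c]."""
    row_stride = width
    image_stride = row_stride * height
    return x + row_stride * y + image_stride * c


def nhwc_to_nchw(data, N, H, W, C):
    """Copy an NHWC buffer into a new NCHW buffer."""
    out = [0] * (N * C * H * W)
    for n in range(N):
        for h in range(H):
            for w in range(W):
                for c in range(C):
                    src = nhwc_offset(n, h, w, c, H, W, C)
                    out[nchw_offset(n, c, h, w, C, H, W)] = data[src]
    return out


if __name__ == "__main__":
    print(nhwc_to_nchw(list(range(8)), N=1, H=2, W=2, C=2))
```

For a single image, the BNNS offset coincides with the NCHW offset, which is why the table lists BNNS as NCHW.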
8
filter | input[1]: [depth_out, filter_height, filter_width, depth_in] | weight[ outputChannels ][ kernelHeight ][ kernelWidth ][ inputChannels / groups ]
MPSCNNConvolutionDescriptor.{kernelWidth, kernelHeight, inputFeatureChannels, outputFeatureChannels}
MPSCNNConvolutionDataSource.weights
4-D tensor [depth_out, depth_in, filter_height, filter_width]
DML_CONVOLUTION_OPERATOR_DESC.FilterTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_OWNED_BY_DML, dimensions, strides}
4-D tensor [depth_out, depth_in, filter_height, filter_width]
BNNSConvolutionLayerParameters.{in_channels, out_channels, k_width, k_height}
The weight (o, i, kx, ky) for output channel o (of out_channels), input channel i (of in_channels), and kernel point (kx, ky)
is stored in weights[kx + k_width * (ky + k_height * (i + in_channels * o))]
4-D tensor [depth_out, depth_in, filter_height, filter_width]
weights_desc of convolution_forward::desc
cldnn_convolution_desc.weights | Inputs["W"]: [M, C/group, kH, kW], where C is the number of channels, kH and kW are the height and width of the kernel, and M is the number of feature maps. For a common convolution, group is 1
[Note]: also supports tensors with more than 4 dimensions, of size (M x C/group x k1 x k2 x ... x kn)
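The filter row implies a weight reordering when moving from the NNAPI layout [depth_out, filter_height, filter_width, depth_in] to the [depth_out, depth_in, filter_height, filter_width] layout used by DirectML, BNNS, DNNL and ONNX. A minimal sketch, with hypothetical function names and flat Python lists standing in for real buffers:

```python
def bnns_weight_index(o, i, kx, ky, k_width, k_height, in_channels):
    """Index formula quoted in the BNNS cell of the filter row."""
    return kx + k_width * (ky + k_height * (i + in_channels * o))


def ohwi_to_oihw(weights, depth_out, filter_h, filter_w, depth_in):
    """Reorder [O, H, W, I] weights (NNAPI) into [O, I, H, W] order."""
    out = [0] * (depth_out * depth_in * filter_h * filter_w)
    for o in range(depth_out):
        for ky in range(filter_h):
            for kx in range(filter_w):
                for i in range(depth_in):
                    src = ((o * filter_h + ky) * filter_w + kx) * depth_in + i
                    dst = bnns_weight_index(o, i, kx, ky,
                                            filter_w, filter_h, depth_in)
                    out[dst] = weights[src]
    return out


if __name__ == "__main__":
    print(ohwi_to_oihw([0, 1, 2, 3], depth_out=1, filter_h=1,
                       filter_w=2, depth_in=2))
```

The destination index reuses the BNNS formula because it is exactly the row-major offset of an [O, I, H, W] tensor.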
9
bias | input[2]: [depth_out] | MPSCNNConvolutionDataSource.biasTerms | 4-D tensor [1, depth_out, 1, 1]
DML_CONVOLUTION_OPERATOR_DESC.BiasTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_OWNED_BY_DML, dimensions, strides}
BNNSLayer BNNSConvolutionLayerParameters.bias | bias_desc of convolution_forward::desc | cldnn_convolution_desc.bias | Inputs["B"]: 1-D tensor [M], where M is the number of feature maps
10
padding | explicit padding:
input[3:6]: left, right, top, bottom
implicit padding:
input[3]: SAME, VALID
MPSCNNConvolutionNode.offset.{x, y}
[Note]: right and bottom of explicit padding is not supported?
[Note]: implicit padding is not supported?
DML_CONVOLUTION_OPERATOR_DESC.StartPadding {padding_top, padding_left}
DML_CONVOLUTION_OPERATOR_DESC.EndPadding {padding_bottom, padding_right}
[Note]: implicit padding is not supported?
BNNSConvolutionLayerParameters.x_padding
BNNSConvolutionLayerParameters.y_padding
[Note]: only supports equal left and right padding, and equal top and bottom padding
[Note]: implicit padding is not supported
padding_l of convolution_forward::desc: [top, left]
padding_r of convolution_forward::desc: [bottom, right]
[Note]: implicit padding is not supported
cldnn_convolution_desc.input_offset
[Note]: implicit padding is not supported
explicit padding:
Attributes["pads"]: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`; for a 2D image with axes (height, width), this is [top, left, bottom, right]
implicit padding:
Attributes["auto_pad"]: NOTSET, SAME_UPPER, SAME_LOWER or VALID
[Note]: the SAME_UPPER and SAME_LOWER modes of implicit padding are not mapped
[Note]: the ONNX documentation carries a deprecation note for implicit padding
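Because several backends in the padding row accept only explicit padding, the implicit SAME/VALID schemes have to be converted first. The sketch below assumes the TensorFlow-style SAME convention (output size is ceil(input / stride), and any odd padding pixel goes to the end of the axis); the function names and the returned dictionary keys are hypothetical.

```python
import math


def same_padding(in_size, filter_size, stride, dilation=1):
    """Return (pad_begin, pad_end) along one axis for SAME padding."""
    effective_filter = (filter_size - 1) * dilation + 1
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + effective_filter - in_size, 0)
    pad_begin = total // 2  # an odd leftover pixel goes to the end
    return pad_begin, total - pad_begin


def explicit_padding_2d(height, width, filter_h, filter_w,
                        stride_h, stride_w, scheme):
    """Convert an implicit scheme into explicit per-side padding."""
    if scheme == "VALID":
        top = bottom = left = right = 0
    elif scheme == "SAME":
        top, bottom = same_padding(height, filter_h, stride_h)
        left, right = same_padding(width, filter_w, stride_w)
    else:
        raise ValueError("unknown padding scheme: " + scheme)
    return {
        # NNAPI explicit order: left, right, top, bottom
        "nnapi": [left, right, top, bottom],
        # ONNX pads: begin values for (height, width), then end values
        "onnx_pads": [top, left, bottom, right],
    }


if __name__ == "__main__":
    print(explicit_padding_2d(224, 224, 3, 3, 2, 2, "SAME"))
```

Backends that require equal padding on both sides of an axis (see the BNNS note) cannot represent the asymmetric result that SAME produces when the total padding is odd.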
11
stride | input[4:5]: stride_width, stride_height | MPSCNNConvolutionDescriptor.strideInPixelsX
MPSCNNConvolutionDescriptor.strideInPixelsY
DML_CONVOLUTION_OPERATOR_DESC.Strides {stride_width, stride_height} | BNNSConvolutionLayerParameters.x_stride
BNNSConvolutionLayerParameters.y_stride
strides of convolution_forward::desc: [stride_width, stride_height] | cldnn_convolution_desc.stride | Attributes["strides"]: list of ints, stride along each axis; for a 2D image, stride_height, stride_width
12
fused activation
input[6]: RELU, RELU1, RELU6 | MPSCNNConvolutionDescriptor.neuron
MPSCNNNeuronReLU
MPSCNNNeuronReLUN
[Note]: RELU1 min(1.f, max(-1.f, input)) not supported
DML_OPERATOR_ACTIVATION_RELU
DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1}
DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6}
BNNSActivationFunctionRectifiedLinear
BNNSActivationFunctionClamp: {min(max(x, alpha), beta)}
post_ops::append_eltwise:
eltwise_relu
eltwise_bounded_relu
cldnn_convolution_desc.with_activation
[Note]: fused RELU1 and RELU6 are not supported
[Note]: not mentioned; might not be needed at the IR level
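The fused activation row maps RELU1 and RELU6 onto clip or clamp operators with bounds {-1, 1} and {0, 6}. A minimal sketch of that equivalence, using the RELU1 definition from the MPS note and the BNNS clamp formula min(max(x, alpha), beta); the function names and the "NONE" case are hypothetical additions for illustration.

```python
def clamp(x, alpha, beta):
    """BNNS-style clamp: min(max(x, alpha), beta)."""
    return min(max(x, alpha), beta)


def fused_activation(x, kind):
    """Scalar reference for the fused activation codes in the table."""
    if kind == "NONE":
        return x
    if kind == "RELU":
        return max(0.0, x)
    if kind == "RELU1":
        return clamp(x, -1.0, 1.0)  # min(1.f, max(-1.f, input))
    if kind == "RELU6":
        return clamp(x, 0.0, 6.0)
    raise ValueError("unknown activation: " + kind)


if __name__ == "__main__":
    for value in (-3.0, 0.5, 7.5):
        print(value, [fused_activation(value, k)
                      for k in ("RELU", "RELU1", "RELU6")])
```

Where a backend lacks a fused form, the same function can be applied as a separate element-wise pass after the convolution.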
13
dilation rate | input[11:12]: dilation_width, dilation_height | MPSCNNConvolutionDescriptor.{dilationRateX, dilationRateY} | DML_CONVOLUTION_OPERATOR_DESC.Dilations {dilation_width, dilation_height} | [Note]: not supported | dilates of convolution_forward::desc: [dilation_width, dilation_height] | cldnn_convolution_desc.dilation | Attributes["dilations"]: list of ints, dilation value along each axis of the filter
14
output | 4-D tensor (NHWC)
output[0]: [batches, out_height, out_width, depth_out]
4-D tensor (NHWC)
MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
destinationImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_CONVOLUTION_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
4-D tensor (NCHW)
BNNSImageStackDescriptor:
P(c,x,y) at position(x,y) in channel c is stored in data[x + row_stride * y + image_stride * c]
row_stride = width
image_stride = row_stride * height
4-D tensor (NCHW)
dst_desc of convolution_forward::desc
4-D tensor (NHWC or NCHW)
cldnn_convolution_desc.with_output_size
cldnn_convolution_desc.output_size
Output tensor of primitive
Outputs["Y"]: for 4-D tensor (NCHW)
15
16
Depthwise Convolution
(additional to convolution)
op | ANEURALNETWORKS_DEPTHWISE_CONV_2D | MPSCNNConvolutionNode +
MPSCNNDepthWiseConvolutionDescriptor
DML_OPERATOR_CONVOLUTION +
DML_CONVOLUTION_OPERATOR_DESC
Set DML_CONVOLUTION_OPERATOR_DESC.GroupCount to in_channels
[Note]: not supported | convolution_forward with weights_format = dnnl_hwigo and group size = number of filters | Set cldnn_convolution_desc.split to depth_out | Use the same Conv op with Attributes["group"] equal to in_channels
[Note]: not well documented in the ONNX doc; hints taken from https://github.com/onnx/tensorflow-onnx/blob/master/tf2onnx/tfonnx.py#L519
17

depthwise multiplier
input[6]: depthwise multiplier | MPSCNNDepthWiseConvolutionDescriptor.channelMultiplier | [Note]: only tested with multiplier 1 | [Note]: not supported | [Note]: only multiplier 1 is supported | [Note]: only supports multiplier 1 | out_channels = K * in_channels, where K is the depthwise multiplier
[Note]: not well documented in the ONNX doc; hints taken from https://github.com/onnx/tensorflow-onnx/blob/master/tf2onnx/tfonnx.py#L519
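The depthwise rows describe the same trick on several backends: run an ordinary grouped convolution with the group count set to in_channels. The sketch below derives the grouped-convolution parameters from the relations quoted in the table (out_channels = K * in_channels and the ONNX weight shape [M, C/group, kH, kW]); the function name and dictionary keys are hypothetical.

```python
def depthwise_as_grouped_conv(in_channels, multiplier, kernel_h, kernel_w):
    """Describe a depthwise convolution as a grouped convolution."""
    group = in_channels                      # one group per input channel
    out_channels = multiplier * in_channels  # out_channels = K * in_channels
    # ONNX Conv weight shape: [M, C/group, kH, kW]
    weight_shape = [out_channels, in_channels // group, kernel_h, kernel_w]
    return {
        "group": group,
        "out_channels": out_channels,
        "weight_shape": weight_shape,
    }


if __name__ == "__main__":
    print(depthwise_as_grouped_conv(32, 1, 3, 3))
```

With multiplier 1, the only value most backends in the table report as supported, each group has exactly one input and one output channel.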
18
19
op | ANEURALNETWORKS_AVERAGE_POOL_2D | MPSCNNPoolingAverageNode | DML_OPERATOR_AVERAGE_POOLING +
DML_AVERAGE_POOLING_OPERATOR_DESC
BNNSFilterCreatePoolingLayer
BNNSPoolingLayerParameters.BNNSPoolingFunction = BNNSPoolingAverage
dnnl_pooling_forward
alg kind = dnnl_pooling_avg
cldnn_pooling_desc.mode = cldnn_pooling_average | AveragePool
20
Average Pooling | input | 4-D tensor (NHWC)
input[0]: [batches, height, width, depth_in]
4-D tensor (NHWC)
MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
sourceImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_AVERAGE_POOLING_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
4-D tensor (NCHW)
BNNSImageStackDescriptor:
P(c,x,y) at position(x,y) in channel c is stored in data[x + row_stride * y + image_stride * c]
row_stride = width
image_stride = row_stride * height
4-D tensor (NCHW)
src_desc of pooling_forward::desc
4-D tensor (NHWC or NCHW)
cldnn_pooling_desc.input
4-D tensor (NCHW)
Inputs["X"]: [batches, channels, height, width]
[Note]: also supports tensors with more than 4 dimensions, shaped [N, C, D1, D2, ..., Dn]
21
padding | explicit padding:
input[1:4]: left, right, top, bottom
implicit padding:
input[1]: SAME, VALID
MPSCNNPoolingAverageNode.offset.{x, y}
[Note]: right and bottom of explicit padding is not supported
[Note]: implicit padding is not supported
DML_AVERAGE_POOLING_OPERATOR_DESC.StartPadding {padding_top, padding_left}
DML_AVERAGE_POOLING_OPERATOR_DESC.EndPadding {padding_bottom, padding_right}
BNNSPoolingLayerParameters.x_padding
BNNSPoolingLayerParameters.y_padding
[Note]: right and bottom of explicit padding is not supported
[Note]: implicit padding is not supported
padding_l of pooling_forward::desc: [top, left]
padding_r of pooling_forward::desc: [bottom, right]
[Note]: implicit padding is not supported
cldnn_pooling_desc.input_offset
[Note]: implicit padding is not supported
explicit padding:
Attributes["pads"]: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis `i` and xi_end is the number of pixels added at the end of axis `i`; for a 2D image with axes (height, width), this is [top, left, bottom, right]
implicit padding:
Attributes["auto_pad"]: NOTSET, SAME_UPPER, SAME_LOWER or VALID
[Note]: the SAME_UPPER and SAME_LOWER modes of implicit padding are not mapped
[Note]: the ONNX documentation carries a deprecation note for implicit padding
22
stride | input[2:3]: stride_width, stride_height | MPSCNNPoolingAverageNode.{strideInPixelsX, strideInPixelsY} | DML_AVERAGE_POOLING_OPERATOR_DESC.Strides {stride_width, stride_height} | BNNSPoolingLayerParameters.x_stride
BNNSPoolingLayerParameters.y_stride
strides of pooling_forward::desc: [stride_width, stride_height] | cldnn_pooling_desc.stride | Attributes["strides"]: list of ints, stride along each axis; for a 2D image, stride_height, stride_width
23
filter size | input[4:5]: filter_width, filter_height | MPSCNNPoolingAverageNode.{kernelWidth, kernelHeight} | DML_AVERAGE_POOLING_OPERATOR_DESC.WindowSize {filter_width, filter_height} | BNNSPoolingLayerParameters.k_width,
BNNSPoolingLayerParameters.k_height
2D tensor [filter_width, filter_height] kernel of pooling_forward::desc
cldnn_pooling_desc.size | Attributes["kernel_shape"]: list of ints, the size of the kernel along each axis; for a 2D image, filter_height, filter_width
24
fused activation
input[6]: RELU, RELU1, RELU6 | [Note]: not supported | DML_OPERATOR_ACTIVATION_RELU
DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1}
DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6}
BNNSActivationFunctionRectifiedLinear
BNNSActivationFunctionClamp: {min(max(x, alpha), beta)}
post_ops::append_eltwise:
eltwise_relu
eltwise_bounded_relu
[Note]: not supported | [Note]: not mentioned; might not be needed at the IR level
25
output | 4-D tensor (NHWC)
output[0]: [batches, out_height, out_width, depth]
4-D tensor (NHWC)
MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
destinationImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_AVERAGE_POOLING_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
4-D tensor (NCHW)
BNNSImageStackDescriptor:
P(c,x,y) at position(x,y) in channel c is stored in data[x + row_stride * y + image_stride * c]
row_stride = width
image_stride = row_stride * height
4-D tensor (NCHW) dst_desc of pooling_forward::desc
4-D tensor (NHWC or NCHW)
cldnn_pooling_desc.with_output_size
cldnn_pooling_desc.output_size
Output tensor of primitive
Outputs["Y"]: for 4-D tensor (NCHW)
26
27
Max Pooling | op | ANEURALNETWORKS_MAX_POOL_2D | MPSCNNPoolingMaxNode | DML_OPERATOR_MAX_POOLING +
DML_MAX_POOLING_OPERATOR_DESC
BNNSFilterCreatePoolingLayer
BNNSPoolingLayerParameters.BNNSPoolingFunction = BNNSPoolingMax
dnnl_pooling_forward
alg kind = dnnl_pooling_max
cldnn_pooling_desc.mode = cldnn_pooling_max | MaxPool
28
[Note]: other parameters are same as Average Pooling 2D
[Note]: Attributes["storage_order"] is not mapped (issue: https://github.com/onnx/onnx/issues/1370)
[Note]: Optional Outputs["Indices"] is not mapped
29
30
Softmax | op | ANEURALNETWORKS_SOFTMAX | MPSCNNSoftMaxNode | DML_OPERATOR_ACTIVATION_SOFTMAX +
DML_ACTIVATION_SOFTMAX_OPERATOR_DESC
BNNSFilterCreateVectorActivationLayer
BNNSActivation = BNNSActivationFunctionSoftmax
dnnl_softmax_forward_desc | cldnn_softmax_desc | Softmax
31
input | input[0]: 2-D or 4-D tensor (NHWC) | sourceImage: 2-D or 4-D tensor (NHWC) | 4-D tensor (NCHW)
DML_ACTIVATION_SOFTMAX_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSVectorDescriptor {size, data_type, data_scale, data_bias}
size represents vector dimension
data_desc of dnnl_softmax_forward_desc
softmax_axis: axis over which softmax is computed
Up to 4-D tensor (NHWC or NCHW)
cldnn_softmax_desc.with_output_size
cldnn_softmax_desc.output_size
Inputs["input"]: 2-D tensor [batch_size, input_feature_dimensions]
[Note]: document says input "X" (issue: https://github.com/onnx/onnx/issues/1369) does not need to explicitly be a 2D vector; rather, it will be coerced into one. Attributes["axis"] describes the axis of the inputs when coerced to 2D
32
beta | input[1]: float32 value | [Note]: only supports beta as 1.0 | [Note]: only supports beta as 1.0 | BNNSActivation.beta | [Note]: only supports beta as 1.0 | [Note]: only supports beta as 1.0 | [Note]: not supported
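Most backends in the beta row accept only beta = 1.0. The sketch below assumes the NNAPI definition of softmax, in which the input is scaled by beta inside the exponent, and shows one possible workaround that is not taken from the table: pre-multiplying the input by beta and then calling a beta = 1.0 softmax. The function names are hypothetical.

```python
import math


def softmax(values, beta=1.0):
    """exp((x - max) * beta) normalized by the sum over the vector."""
    peak = max(values)
    exps = [math.exp((v - peak) * beta) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_prescaled_input(values, beta):
    """Emulate beta on a backend that only supports beta = 1.0."""
    return softmax([v * beta for v in values], beta=1.0)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print(softmax(data, beta=2.0))
    print(softmax_with_prescaled_input(data, beta=2.0))
```

The two results agree up to floating-point rounding because softmax is invariant to a constant shift of its input.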
33
output | output[0]: output tensor of same shape as input0 | destinationImage: NHWC | 4-D tensor (NCHW)
DML_ACTIVATION_SOFTMAX_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSVectorDescriptor {size, data_type, data_scale, data_bias}
size represents vector dimension
data_desc of dnnl_softmax_forward_desc
softmax_axis: axis over which softmax is computed
Output tensor of primitive | Outputs["output"]: the output values with the same shape as input tensor
34
35
Element-wise Add | op | ANEURALNETWORKS_ADD | MPSCNNAddNode | DML_OPERATOR_ELEMENT_WISE_ADD + DML_ELEMENT_WISE_ADD_OPERATOR_DESC | vDSP_vadd | dnnl_sum_primitive_desc | cldnn_eltwise_desc.mode = cldnn_eltwise_sum | Add
36
input | input[0, 1]: tensor 0 and 1, up to 4-D | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
primaryImage of MPSNNGraph.encode, up to 4-D
secondaryImage of MPSNNGraph.encode, up to 4-D
4-D tensor (NCHW)
DML_ELEMENT_WISE_ADD_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
src[0](dnnl_query_src_md, 0)
src[1](dnnl_query_src_md,1)
...
src[n-1](dnnl_query_src_md,n-1)
4-D tensor (NHWC or NCHW)
cldnn_eltwise_desc.input (size = 2)
Inputs["A"] and Inputs["B"]: tensor
[Note]: no limitation of tensor dimension
37
scale | [Note]: not supported | MPSCNNArithmeticNode.{primaryScale, secondaryScale} | [Note]: not supported | scales of sum_primitive_desc: vector of scales to multiply the data in each source memory by | [Note]: not supported | [Note]: not supported
38
bias | [Note]: not supported | MPSCNNArithmeticNode.bias | [Note]: not supported | [Note]: not supported | [Note]: not supported | [Note]: not supported
39
fused activation
input[2]: RELU, RELU1, RELU6 | [Note]: not supported | DML_OPERATOR_ACTIVATION_RELU
DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1}
DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6}
[Note]: not supported | cldnn_eltwise_desc.with_activation
[Note]: fused RELU1 and RELU6 are not supported
[Note]: not mentioned; might not be needed at the IR level
40
output | output[0]: a tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
destinationImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_ELEMENT_WISE_ADD_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
dst_desc of sum_primitive_desc | Output tensor of primitive | Outputs["C"]: tensor
41
42
Element-wise Multiply | op | ANEURALNETWORKS_MUL | MPSNNMultiplicationNode | DML_OPERATOR_ELEMENT_WISE_MULTIPLY + DML_ELEMENT_WISE_MULTIPLY_OPERATOR_DESC | vDSP_vmul | [Note]: not supported | cldnn_eltwise_desc.mode = cldnn_eltwise_prod | Mul
43
input | input[0, 1]: tensor 0 and 1, up to 4-D | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
primaryImage of MPSNNGraph.encode, up to 4-D
secondaryImage of MPSNNGraph.encode, up to 4-D
4-D tensor (NCHW)
DML_ELEMENT_WISE_MULTIPLY_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
4-D tensor (NHWC or NCHW)
cldnn_eltwise_desc.input (size = 2)
Inputs["A"] and Inputs["B"]: tensor
[Note]: no limitation of tensor dimension
44
scale | [Note]: not supported | MPSCNNArithmeticNode.{primaryScale, secondaryScale} | [Note]: not supported | [Note]: not supported | [Note]: not supported
45
bias | [Note]: not supported | MPSCNNArithmeticNode.bias | [Note]: not supported | [Note]: not supported | [Note]: not supported
46
fused activation
input[2]: RELU, RELU1, RELU6 | [Note]: not supported | DML_OPERATOR_ACTIVATION_RELU
DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1}
DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6}
cldnn_eltwise_desc.with_activation
[Note]: fused RELU1 and RELU6 are not supported
[Note]: not mentioned; might not be needed at the IR level
47
output | output[0]: a tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
destinationImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_ELEMENT_WISE_MULTIPLY_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
Output tensor of primitive | Outputs["C"]: tensor
48
49
Concatenation | op | ANEURALNETWORKS_CONCATENATION | MPSNNConcatenationNode | DML_OPERATOR_JOIN + DML_JOIN_OPERATOR_DESC | [Note]: not supported
BNNS stores pixel data channel by channel. When the concatenation axis is 4, the concatenation is along the channel, so memcpy is used as a workaround.
dnnl_concat_primitive_desc | cldnn_concatenation_desc | Concat
50
inputs | input[0 ~ n-1]: the list of n input tensors | MPSNNImageNode of input tensors | DML_JOIN_OPERATOR_DESC.InputTensors
DML_JOIN_OPERATOR_DESC.InputCount
src[0](dnnl_query_src_md, 0)
src[1](dnnl_query_src_md,1)
...
src[n-1](dnnl_query_src_md,n-1)
cldnn_concatenation_desc.input (size 2 - n) | Inputs["inputs"]: list of tensors
51
axis | input[n]: specifies the concatenation axis | [Note]: only supports concatenation along the depth channel | DML_JOIN_OPERATOR_DESC.Axis | concat_dimension | cldnn_concatenation_desc.axis | Attributes["axis"]: which axis to concat on
52
output | output[0]: the output tensor | 4-D tensor (NHWC)
MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
destinationImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_JOIN_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
dst(dnnl_query_dst_md, 0) | Output tensor of primitive | Outputs["concat_result"]: concatenated tensor
53
54
Reshape | op | ANEURALNETWORKS_RESHAPE | MPSNNReshape | DML_OPERATOR_CAST + DML_CAST_OPERATOR_DESC | [Note]: not supported | dnnl_reorder_primitive_desc | cldnn_reshape_desc | Reshape
55
input | tensor, up to 4-D
input[0]
MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
sourceImage of MPSNNGraph.encode, up to 4-D
4-D tensor (NCHW)
DML_CAST_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
src_desc of reorder_primitive_desc
4-D tensor {N,C,H,W}
cldnn_reshape_desc.input | Inputs["data"]: tensor
56
output shape | input[1]: A 1-D tensor of int32 defining the shape of the output tensor. | [Note]: not needed; the output shape is specified by the destination image | [Note]: not needed; the output shape is specified by the destination tensor | [Note]: not needed; the output shape is specified by the destination image | cldnn_reshape_desc.output_shape | Inputs["shape"]: tensor(int64)
57
output | output[0]: The output tensor, of shape specified by the input shape. | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
destinationImage of MPSNNGraph.encode, up to 4-D
4-D tensor (NCHW)
DML_CAST_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
dst_desc of reorder_primitive_desc | Output tensor of primitive | Outputs["reshaped"]: reshaped tensor
58
4-D tensor{NCWH}
59
Fully Connected | op | ANEURALNETWORKS_FULLY_CONNECTED | MPSCNNFullyConnectedNode | DML_OPERATOR_GEMM + DML_GEMM_OPERATOR_DESC | BNNSFilterCreateFullyConnectedLayer | dnnl_inner_product_forward_desc | cldnn_fully_connected_desc
60
input | input[0]: a tensor of at least rank 2 | Reshape to MPSImageDescriptor.{1, 1, product(dimensions) / weights[1], weights[1]}
Reshape {n, c, h, w} to {1, 1, input_batch_size, input_size}

dimensions = {1, 1, input_batch_size, input_size}
4-D tensor (NCHW)
DML_GEMM_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSVectorDescriptor: Represents a vector of dimension size.
Each vector element is a scalar value, stored using the type specified in data_type.
input logical order is nc: {input_batch_size, input_size} | cldnn_fully_connected_desc.input (reshape to 2D tensor)
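The input row flattens a rank-2-or-higher tensor into {input_batch_size, input_size}, where input_size is weights[1] and the batch size is product(dimensions) / weights[1]. A minimal sketch of that shape computation follows; the function names are hypothetical, and the divisibility check is an added assumption rather than something the table states.

```python
def fully_connected_input_shape(dimensions, weights_shape):
    """Return [input_batch_size, input_size] for a fully connected input."""
    input_size = weights_shape[1]   # weights are [num_units, input_size]
    total = 1
    for d in dimensions:
        total *= d
    if total % input_size != 0:
        raise ValueError("input element count is not a multiple of input_size")
    return [total // input_size, input_size]


def as_4d_dimensions(shape_2d):
    """Pad the 2-D shape to the {1, 1, batch, size} form used for DirectML."""
    return [1, 1] + list(shape_2d)


if __name__ == "__main__":
    flat = fully_connected_input_shape([2, 7, 7, 64], [1000, 3136])
    print(flat, as_4d_dimensions(flat))
```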
61
weights | input[1]: a 2-D tensor | MPSCNNConvolutionDescriptor.{1, 1, weights[0], weights[1]}
MPSCNNConvolutionDataSource.weights
4-D tensor (NCHW)
DML_GEMM_OPERATOR_DESC.BTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSFullyConnectedLayerParameters.weights | weights logical order is oi: {num_units, input_size} | cldnn_fully_connected_desc.weights
62
bias | input[2]: a 1-D tensor | a 1-D tensor whose size equals outputFeatureChannels
MPSCNNConvolutionDataSource.biasTerms
dimensions = {1, 1, output_batch_size, output_num_units}
bias_strides = {1, 1, 0, 1}
4-D tensor (NCHW)
DML_GEMM_OPERATOR_DESC.CTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSFullyConnectedLayerParameters.bias | 1-D {bias_num_units} | cldnn_fully_connected_desc.bias
63
fused activation
input[3]: RELU, RELU1, RELU6 | MPSCNNConvolutionDescriptor.neuron
MPSCNNNeuronReLU
MPSCNNNeuronReLUN
DML_OPERATOR_ACTIVATION_RELU
DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1}
DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6}
BNNSFullyConnectedLayerParameters.activation | post_ops::append_eltwise:
eltwise_relu
eltwise_bounded_relu
cldnn_fully_connected_desc.with_activation
[Note]: fused RELU1 and RELU6 are not supported
64
output | output[0]: the output tensor | MPSImageDescriptor.{1, 1, output_size, outputFeatureChannels}
destinationImage of MPSNNGraph.encode
4-D tensor (NCHW)
DML_GEMM_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSVectorDescriptor: Represents a vector of dimension size.
Each vector element is a scalar value, stored using the type specified in data_type
output logical order is nc: {output_batch_size, output_num_units}
cldnn_fully_connected_desc.output_shape
65
66
Resize Bilinear | op | ANEURALNETWORKS_RESIZE_BILINEAR | MPSCNNUpsamplingBilinearNode | DML_OPERATOR_UPSAMPLE_2D + DML_UPSAMPLE_2D_OPERATOR_DESC | vImageVerticalShear and vImageHorizontalShear | dnnl_resampling_forward_desc
alg_kind = dnnl_resampling_linear
[Note]: not supported (could be implemented with cldnn_custom_gpu_primitive_desc for a custom OpenCL kernel)
67
input | input[0]: a 4-D tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
4-D tensor (NCHW)
DML_UPSAMPLE_2D_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
vImage_Buffer | src_desc of resampling_forward_desc
4-D tensor{N,C,H,W}
68
output height | input[1]: height of output tensor
[Note]: It must be an integral multiple of the input height
[Note]: It must be an integral multiple of the input height
vImage_Buffer.height | the height of output tensor
69
output width | input[2]: width of output tensor
[Note]: It must be an integral multiple of the input width
[Note]: It must be an integral multiple of the input width
vImage_Buffer.width | the width of output tensor
70
output | output[0]: output tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
4-D tensor (NCHW)
DML_UPSAMPLE_2D_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
vImage_Buffer | dst_desc of resampling_forward_desc
4-D tensor {N,C,H,W}
71
72
Transpose | op | ANEURALNETWORKS_TRANSPOSE
73
input | input[0]: An n-D tensor
74
permutation | An optional 1-D tensor
75
output | output[0]: A tensor of the same type as input0.
76
77
BatchToSpace | op | ANEURALNETWORKS_BATCH_TO_SPACE_ND
78
input | input[0]: An n-D tensor to be reshaped
input[2]: An optional boolean scalar
79
block sizes | input[1]: A 1-D tensor
80
output | output[0]: A tensor of the same type as input0.
81
82
Element-wise Maximum | op | ANEURALNETWORKS_MAXIMUM
83
input | input[0]: A tensor.
input[1]: A tensor of the same type and compatible dimensions with input0.
84
output | output[0]: A tensor of the same type as input0.
85
86
Tanh | op | ANEURALNETWORKS_TANH
87
input | input[0]: A tensor
88
output | output[0]: A tensor of the same shape as input0.
89
90
Argmax | op | ANEURALNETWORKS_ARGMAX | MPSNNReductionFeatureChannelsArgumentMaxNode | DML_OPERATOR_REDUCE +
DML_REDUCE_OPERATOR_DESC
cldnn_arg_max_min_desc
91
input | input[0]: An n-D tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
4-D tensor (NCHW)
DML_REDUCE_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
cldnn_arg_max_min_desc.input
92
axis | input[1]: An int32 scalar | [Note]: only supports argmax along the depth channel | DML_REDUCE_OPERATOR_DESC.Axis | cldnn_arg_max_min_desc.axis
93
output | output[0]: An (n - 1)-D int32 tensor. | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
4-D tensor (NCHW)
DML_REDUCE_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE_UINT32, DML_TENSOR_FLAG_NONE, dimensions, strides}
cldnn_arg_max_min_desc.output
94
95
Sigmoid | op | ANEURALNETWORKS_LOGISTIC | MPSCNNNeuronSigmoidNode | DML_OPERATOR_ACTIVATION_SIGMOID +
DML_ACTIVATION_SIGMOID_OPERATOR_DESC
BNNSFilterCreateVectorActivationLayer
BNNSActivation = BNNSActivationFunctionSigmoid
dnnl_eltwise_forward_desc
alg_kind = dnnl_eltwise_logistic
cldnn_activation_desc with activation_desc.activation_func = activation_logistic
96
input | input[0]: A tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
4-D tensor (NCHW)
DML_ACTIVATION_SIGMOID_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSVectorDescriptor {size, data_type, data_scale, data_bias}
size represents vector dimension
src_desc
4-D tensor {N,C,H,W}
cldnn_activation_desc.input
97
output | output[0]: The output tensor of same shape as input0. | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
4-D tensor (NCHW)
DML_ACTIVATION_SIGMOID_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
BNNSVectorDescriptor {size, data_type, data_scale, data_bias}
size represents vector dimension
dst_desc
4-D tensor {N,C,H,W}
cldnn_activation_desc.output
98
99
Prelu | op | ANEURALNETWORKS_PRELU | MPSCNNNeuronPReLUNode | DML_OPERATOR_ACTIVATION_PARAMETERIZED_RELU +
DML_ACTIVATION_PARAMETERIZED_RELU_OPERATOR_DESC
cldnn_activation_desc with activation_desc.activation_func = activation_relu_negative_slope
100
input | input[0]: A tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height}
4-D tensor (NCHW)
DML_ACTIVATION_PARAMETERIZED_RELU_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides}
cldnn_activation_desc.input