| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
2 | Tensor Type | Float32 | ANEURALNETWORKS_TENSOR_FLOAT32 | MPSImageFeatureChannelFormatFloat32 (MPS uses float16 internally) | DML_TENSOR_DATA_TYPE_FLOAT32 | BNNSDataTypeFloat32 | memory::data_type::f32 | cldnn_f32 | float32 (Tensor Element Types) |
3 | | Float16 | ANEURALNETWORKS_TENSOR_FLOAT16 | MPSImageFeatureChannelFormatFloat16 | DML_TENSOR_DATA_TYPE_FLOAT16 | BNNSDataTypeFloat16 | memory::data_type::f16 | cldnn_f16 | float16 (Tensor Element Types) |
4 | | Quantized Int8 | ANEURALNETWORKS_TENSOR_QUANT8_ASYMM real_value = (integer_value - zeroPoint) * scale | [Note]: not supported | DML_TENSOR_DATA_TYPE_UINT8 | [Note]: not supported | memory::data_type::s8 | [Note]: not supported | [Note]: not mentioned |
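
NNAPI is the only API in this row that bakes a quantization scheme into the tensor type (real_value = (integer_value - zeroPoint) * scale). A minimal NumPy sketch of that scheme; `quantize`/`dequantize` are illustrative helpers, not part of any of the listed APIs:

```python
import numpy as np

# NNAPI ANEURALNETWORKS_TENSOR_QUANT8_ASYMM:
# real_value = (integer_value - zeroPoint) * scale
def dequantize(q, scale, zero_point):
    return (q.astype(np.int32) - zero_point) * scale

def quantize(x, scale, zero_point):
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)   # asymmetric uint8 range

q = np.array([0, 128, 255], dtype=np.uint8)
print(dequantize(q, scale=0.5, zero_point=128))  # [-64.   0.   63.5]
```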

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
6 | Convolution | op | ANEURALNETWORKS_CONV_2D | MPSCNNConvolutionNode + MPSCNNConvolutionDescriptor | DML_OPERATOR_CONVOLUTION + DML_CONVOLUTION_OPERATOR_DESC | BNNSFilterCreateConvolutionLayer | convolution_forward | cldnn_convolution_desc | Conv |
7 | input | 4-D tensor (NHWC) input[0]: [batches, height, width, depth_in] | 4-D tensor (NHWC) MPSImageDescriptor.{numberOfImages, featureChannels, width, height, MPSImageFeatureChannelFormatFloat16} sourceImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_CONVOLUTION_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | 4-D tensor (NCHW) BNNSImageStackDescriptor: P(c,x,y) at position (x,y) in channel c is stored in data[x + row_stride * y + image_stride * c], row_stride = width, image_stride = row_stride * height | 4-D tensor (NCHW) src_desc of convolution_forward::desc | 4-D tensor (NHWC or NCHW) cldnn_convolution_desc.input | 4-D tensor (NCHW) Inputs["X"]: [batches, channels, height, width] [Note]: also supports tensors > 4-D with [N, C, D1, D2, ..., Dn] |
8 | filter | input[1]: [depth_out, filter_height, filter_width, depth_in] | weight[outputChannels][kernelHeight][kernelWidth][inputChannels / groups] MPSCNNConvolutionDescriptor.{kernelWidth, kernelHeight, inputFeatureChannels, outputFeatureChannels} MPSCNNConvolutionDataSource.weights | 4-D tensor [depth_out, depth_in, filter_height, filter_width] DML_CONVOLUTION_OPERATOR_DESC.FilterTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_OWNED_BY_DML, dimensions, strides} | 4-D tensor [depth_out, depth_in, filter_height, filter_width] BNNSConvolutionLayerParameters.{in_channels, out_channels, k_width, k_height} The weight (o, i, kx, ky) for output channel o, input channel i and kernel point (kx, ky) is stored in weights[kx + k_width * (ky + k_height * (i + in_channels * o))] | 4-D tensor [depth_out, depth_in, filter_height, filter_width] weights_desc of convolution_forward::desc | cldnn_convolution_desc.weights | Inputs["W"]: [M, C/group, kH, kW], where C is the number of channels, kH and kW are the height and width of the kernel, and M is the number of feature maps. For common convolution, group is 1 [Note]: also supports tensors > 4-D with size (M x C/group x k1 x k2 x ... x kn) |
9 | bias | input[2]: [depth_out] | MPSCNNConvolutionDataSource.biasTerms | 4-D tensor [1, depth_out, 1, 1] DML_CONVOLUTION_OPERATOR_DESC.BiasTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_OWNED_BY_DML, dimensions, strides} | BNNSConvolutionLayerParameters.bias (BNNSLayerData) | bias_desc of convolution_forward::desc | cldnn_convolution_desc.bias | Inputs["B"]: 1-D tensor [M], M is the number of feature maps |
10 | padding | explicit padding: input[3:6]: left, right, top, bottom implicit padding: input[3]: SAME, VALID | MPSCNNConvolutionNode.offset.{x, y} [Note]: right and bottom of explicit padding are not supported? [Note]: implicit padding is not supported? | DML_CONVOLUTION_OPERATOR_DESC.StartPadding {padding_top, padding_left} DML_CONVOLUTION_OPERATOR_DESC.EndPadding {padding_bottom, padding_right} [Note]: implicit padding is not supported? | BNNSConvolutionLayerParameters.x_padding BNNSConvolutionLayerParameters.y_padding [Note]: only supports the same left and right paddings, and the same top and bottom paddings [Note]: implicit padding is not supported | padding_l of convolution_forward::desc: [top, left] padding_r of convolution_forward::desc: [bottom, right] [Note]: implicit padding is not supported | cldnn_convolution_desc.input_offset [Note]: implicit padding is not supported | explicit padding: Attributes["pads"]: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis i and xi_end the number added at the end of axis i; for a 2-D image it is [top, bottom, left, right] implicit padding: Attributes["auto_pad"]: NOTSET, SAME_UPPER, SAME_LOWER or VALID [Note]: implicit padding's SAME_UPPER or SAME_LOWER are not mapped [Note]: implicit padding has a DEPRECATION NOTE |
11 | stride | input[4:5]: stride_width, stride_height | MPSCNNConvolutionDescriptor.strideInPixelsX MPSCNNConvolutionDescriptor.strideInPixelsY | DML_CONVOLUTION_OPERATOR_DESC.Strides {stride_width, stride_height} | BNNSConvolutionLayerParameters.x_stride BNNSConvolutionLayerParameters.y_stride | strides of convolution_forward::desc: [stride_width, stride_height] | cldnn_convolution_desc.stride | Attributes["strides"]: list of ints, stride along each axis, for 2D image, stride_height, stride_width | ||||||||||||||||||||||
12 | fused activation | input[6]: RELU, RELU1, RELU6 | MPSCNNConvolutionDescriptor.neuron MPSCNNNeuronReLU MPSCNNNeuronReLUN [Note]: RELU1 min(1.f, max(-1.f, input)) not supported | DML_OPERATOR_ACTIVATION_RELU DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1} DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6} | BNNSActivationFunctionRectifiedLinear BNNSActivationFunctionClamp: {min(max(x, alpha), beta)} | post_ops::append_eltwise: eltwise_relu eltwise_bounded_relu | cldnn_convolution_desc.with_activation [Note]: fused RELU1 and RELU6 are not supported | [Note]: not mentioned, might not be needed at IR level |
13 | dilation rate | input[11:12]: dilation_width, dilation_height | MPSCNNConvolutionDescriptor.{dilationRateX, dilationRateY} | DML_CONVOLUTION_OPERATOR_DESC.Dilations {dilation_width, dilation_height} | [Note]: not supported | dilates of convolution_forward::desc: [dilation_width, dilation_height] | cldnn_convolution_desc.dilation | Attributes["dilations"]: list of ints, dilation value along each axis of the filter | ||||||||||||||||||||||
14 | output | 4-D tensor (NHWC) output[0]: [batches, out_height, out_width, depth_out] | 4-D tensor (NHWC) MPSImageDescriptor.{numberOfImages, featureChannels, width, height} destinationImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_CONVOLUTION_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | 4-D tensor (NCHW) BNNSImageStackDescriptor: P(c,x,y) at position (x,y) in channel c is stored in data[x + row_stride * y + image_stride * c], row_stride = width, image_stride = row_stride * height | 4-D tensor (NCHW) dst_desc of convolution_forward::desc | 4-D tensor (NHWC or NCHW) cldnn_convolution_desc.with_output_size cldnn_convolution_desc.output_size Output tensor of primitive | Outputs["Y"]: 4-D tensor (NCHW) |
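
The main portability hazard in this table is layout: NNAPI and MPS describe NHWC tensors while DirectML, BNNS and DNNL describe NCHW. A small NumPy sketch showing that the BNNS indexing formula quoted above is plain planar CHW storage, and that the NHWC-to-NCHW move is a transpose:

```python
import numpy as np

# BNNS stores pixel P(c, x, y) at data[x + row_stride * y + image_stride * c]
# with row_stride = width, image_stride = width * height: contiguous CHW order.
c_, h, w = 3, 4, 5
chw = np.arange(c_ * h * w, dtype=np.float32).reshape(c_, h, w)
data = chw.ravel()
row_stride, image_stride = w, w * h
c, y, x = 2, 1, 3
assert data[x + row_stride * y + image_stride * c] == chw[c, y, x]

# Feeding an NNAPI-style NHWC tensor to an NCHW backend is a transpose.
nhwc = np.random.rand(1, h, w, c_).astype(np.float32)
nchw = nhwc.transpose(0, 3, 1, 2)
print(nchw.shape)   # (1, 3, 4, 5)
```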

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
16 | Depthwise Convolution (additional to Convolution) | op | ANEURALNETWORKS_DEPTHWISE_CONV_2D | MPSCNNConvolutionNode + MPSCNNDepthWiseConvolutionDescriptor | DML_OPERATOR_CONVOLUTION + DML_CONVOLUTION_OPERATOR_DESC Set DML_CONVOLUTION_OPERATOR_DESC.GroupCount to in_channels | [Note]: not supported | convolution_forward with weights_format = dnnl_hwigo and group size = number of filters | Set cldnn_convolution_desc.split as depth_out | Use the same Conv op with Attributes["group"] equal to in_channels [Note]: not well documented in the ONNX doc; hints from https://github.com/onnx/tensorflow-onnx/blob/master/tf2onnx/tfonnx.py#L519 |
17 | depthwise multiplier | input[6]: depthwise multiplier | MPSCNNDepthWiseConvolutionDescriptor.channelMultiplier | [Note]: only tested with multiplier 1 | [Note]: not supported | [Note]: only multiplier 1 is supported | [Note]: only supports multiplier 1 | out_channels = K * in_channels, where K is the depthwise multiplier [Note]: not well documented in the ONNX doc; hints from https://github.com/onnx/tensorflow-onnx/blob/master/tf2onnx/tfonnx.py#L519 |
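
The depthwise mapping above reduces to a grouped convolution with group = in_channels. A NumPy sketch (stride 1, no padding, multiplier K = 1, matching the backends that only support K = 1; `depthwise_conv_valid` is a hypothetical helper):

```python
import numpy as np

# Depthwise convolution: each output channel convolves only its own input
# channel, i.e. a grouped convolution with group == number of input channels.
def depthwise_conv_valid(x_nhwc, w_hwc):
    n, h, w, c = x_nhwc.shape
    kh, kw, _ = w_hwc.shape
    out = np.zeros((n, h - kh + 1, w - kw + 1, c), dtype=x_nhwc.dtype)
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x_nhwc[:, i:i + kh, j:j + kw, :]          # [N, kh, kw, C]
            out[:, i, j, :] = np.einsum('nhwc,hwc->nc', patch, w_hwc)
    return out

x = np.random.rand(1, 5, 5, 3).astype(np.float32)
w = np.random.rand(3, 3, 3).astype(np.float32)                # [kH, kW, C]
print(depthwise_conv_valid(x, w).shape)                       # (1, 3, 3, 3)
```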

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
19 | op | ANEURALNETWORKS_AVERAGE_POOL_2D | MPSCNNPoolingAverageNode | DML_OPERATOR_AVERAGE_POOLING + DML_AVERAGE_POOLING_OPERATOR_DESC | BNNSFilterCreatePoolingLayer BNNSPoolingLayerParameters.BNNSPoolingFunction = BNNSPoolingAverage | dnnl_pooling_forward alg_kind = dnnl_pooling_avg | cldnn_pooling_desc.mode = cldnn_pooling_average | AveragePool |
20 | Average Pooling | input | 4-D tensor (NHWC) input[0]: [batches, height, width, depth_in] | 4-D tensor (NHWC) MPSImageDescriptor.{numberOfImages, featureChannels, width, height} sourceImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_AVERAGE_POOLING_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | 4-D tensor (NCHW) BNNSImageStackDescriptor: P(c,x,y) at position (x,y) in channel c is stored in data[x + row_stride * y + image_stride * c], row_stride = width, image_stride = row_stride * height | 4-D tensor (NCHW) src_desc of pooling_forward::desc | 4-D tensor (NHWC or NCHW) cldnn_pooling_desc.input | 4-D tensor (NCHW) Inputs["X"]: [batches, channels, height, width] [Note]: also supports tensors > 4-D with [N, C, D1, D2, ..., Dn] |
21 | padding | explicit padding: input[1:4]: left, right, top, bottom implicit padding: input[1]: SAME, VALID | MPSCNNPoolingAverageNode.offset.{x, y} [Note]: right and bottom of explicit padding are not supported [Note]: implicit padding is not supported | DML_AVERAGE_POOLING_OPERATOR_DESC.StartPadding {padding_top, padding_left} DML_AVERAGE_POOLING_OPERATOR_DESC.EndPadding {padding_bottom, padding_right} | BNNSPoolingLayerParameters.x_padding BNNSPoolingLayerParameters.y_padding [Note]: right and bottom of explicit padding are not supported [Note]: implicit padding is not supported | padding_l of pooling_forward::desc: [top, left] padding_r of pooling_forward::desc: [bottom, right] [Note]: implicit padding is not supported | cldnn_pooling_desc.input_offset [Note]: implicit padding is not supported | explicit padding: Attributes["pads"]: [x1_begin, x2_begin, ..., x1_end, x2_end, ...], where xi_begin is the number of pixels added at the beginning of axis i and xi_end the number added at the end of axis i; for a 2-D image it is [top, bottom, left, right] implicit padding: Attributes["auto_pad"]: NOTSET, SAME_UPPER, SAME_LOWER or VALID [Note]: implicit padding's SAME_UPPER or SAME_LOWER are not mapped [Note]: implicit padding has a DEPRECATION NOTE |
22 | stride | input[2:3]: stride_width, stride_height | MPSCNNPoolingAverageNode.{strideInPixelsX, strideInPixelsY} | DML_AVERAGE_POOLING_OPERATOR_DESC.Strides {stride_width, stride_height} | BNNSPoolingLayerParameters.x_stride BNNSPoolingLayerParameters.y_stride | strides of pooling_forward::desc: [stride_width, stride_height] | cldnn_pooling_desc.stride | Attributes["strides"] : list of ints, stride along each axis, for 2D image, stride_height, stride_width | ||||||||||||||||||||||
23 | filter size | input[4:5]: filter_width, filter_height | MPSCNNPoolingAverageNode.{kernelWidth, kernelHeight} | DML_AVERAGE_POOLING_OPERATOR_DESC.WindowSize {filter_width, filter_height} | BNNSPoolingLayerParameters.k_width, BNNSPoolingLayerParameters.k_height | 2-D tensor [filter_width, filter_height] kernel of pooling_forward::desc | cldnn_pooling_desc.size | Attributes["kernel_shape"]: list of ints, the size of the kernel along each axis; for a 2-D image, filter_height, filter_width |
24 | fused activation | input[6]: RELU, RELU1, RELU6 | [Note]: not supported | DML_OPERATOR_ACTIVATION_RELU DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1} DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6} | BNNSActivationFunctionRectifiedLinear BNNSActivationFunctionClamp: {min(max(x, alpha), beta)} | post_ops::append_eltwise: eltwise_relu eltwise_bounded_relu | [Note]: not supported | [Note]: not mentioned, might not be needed at IR level |
25 | output | 4-D tensor (NHWC) output[0]: [batches, out_height, out_width, depth] | 4-D tensor (NHWC) MPSImageDescriptor.{numberOfImages, featureChannels, width, height} destinationImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_AVERAGE_POOLING_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | 4-D tensor (NCHW) BNNSImageStackDescriptor: P(c,x,y) at position (x,y) in channel c is stored in data[x + row_stride * y + image_stride * c], row_stride = width, image_stride = row_stride * height | 4-D tensor (NCHW) dst_desc of pooling_forward::desc | 4-D tensor (NHWC or NCHW) cldnn_pooling_desc.with_output_size cldnn_pooling_desc.output_size Output tensor of primitive | Outputs["Y"]: 4-D tensor (NCHW) |
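
Several backends above only take explicit pads, so NNAPI's implicit SAME/VALID modes have to be resolved first. A sketch of the TensorFlow-style arithmetic implicit padding follows (assuming standard SAME/VALID semantics):

```python
import math

def resolve_padding(in_size, kernel, stride, mode):
    """Return (pad_begin, pad_end) for one spatial axis."""
    if mode == 'VALID':
        return 0, 0
    out = math.ceil(in_size / stride)                    # SAME output size
    pad = max((out - 1) * stride + kernel - in_size, 0)  # total padding needed
    return pad // 2, pad - pad // 2

print(resolve_padding(7, 3, 2, 'SAME'))    # (1, 1) -> output size 4
print(resolve_padding(7, 3, 2, 'VALID'))   # (0, 0) -> output size 3
```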

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
27 | Max Pooling | op | ANEURALNETWORKS_MAX_POOL_2D | MPSCNNPoolingMaxNode | DML_OPERATOR_MAX_POOLING + DML_MAX_POOLING_OPERATOR_DESC | BNNSFilterCreatePoolingLayer BNNSPoolingLayerParameters.BNNSPoolingFunction = BNNSPoolingMax | dnnl_pooling_forward alg_kind = dnnl_pooling_max | cldnn_pooling_desc.mode = cldnn_pooling_max | MaxPool |
28 | | [Note]: other parameters are the same as Average Pooling 2D | [Note]: Attributes["storage_order"] is not mapped (issue: https://github.com/onnx/onnx/issues/1370) [Note]: optional Outputs["Indices"] is not mapped |

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
30 | Softmax | op | ANEURALNETWORKS_SOFTMAX | MPSCNNSoftMaxNode | DML_OPERATOR_ACTIVATION_SOFTMAX + DML_ACTIVATION_SOFTMAX_OPERATOR_DESC | BNNSFilterCreateVectorActivationLayer BNNSActivation = BNNSActivationFunctionSoftmax | dnnl_softmax_forward_desc | cldnn_softmax_desc | Softmax | |||||||||||||||||||||
31 | input | input[0]: 2-D or 4-D tensor (NHWC) | sourceImage: 2-D or 4-D tensor (NHWC) | 4-D tensor (NCHW) DML_ACTIVATION_SOFTMAX_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSVectorDescriptor {size, data_type, data_scale, data_bias}, size represents the vector dimension | data_desc of dnnl_softmax_forward_desc softmax_axis: axis over which softmax is computed | Up to 4-D tensor (NHWC or NCHW) cldnn_softmax_desc.with_output_size cldnn_softmax_desc.output_size | Inputs["input"]: 2-D tensor [batch_size, input_feature_dimensions] [Note]: the document says input "X" (issue: https://github.com/onnx/onnx/issues/1369) does not need to explicitly be a 2-D vector; rather, it will be coerced into one. Attributes["axis"] describes the axis of the inputs when coerced to 2-D |
32 | beta | input[1]: float32 value | [Note]: only supports beta = 1.0 | [Note]: only supports beta = 1.0 | BNNSActivation.beta | [Note]: only supports beta = 1.0 | [Note]: only supports beta = 1.0 | [Note]: not supported |
33 | output | output[0]: output tensor of the same shape as input0 | destinationImage: NHWC | 4-D tensor (NCHW) DML_ACTIVATION_SOFTMAX_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSVectorDescriptor {size, data_type, data_scale, data_bias}, size represents the vector dimension | data_desc of dnnl_softmax_forward_desc softmax_axis: axis over which softmax is computed | Output tensor of primitive | Outputs["output"]: the output values with the same shape as the input tensor |
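
NNAPI's softmax carries a beta scalar that most backends above pin to 1.0. Since softmax(x, beta) = softmax(beta * x), beta can be folded into the logits before dispatching; a NumPy sketch:

```python
import numpy as np

def softmax(x, beta=1.0, axis=-1):
    z = beta * x
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

x = np.array([[1.0, 2.0, 3.0]])
# Folding beta into the input gives the same result as a beta-aware softmax.
np.testing.assert_allclose(softmax(x, beta=2.0), softmax(2.0 * x))
```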

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
35 | Element-wise Add | op | ANEURALNETWORKS_ADD | MPSCNNAddNode | DML_OPERATOR_ELEMENT_WISE_ADD + DML_ELEMENT_WISE_ADD_OPERATOR_DESC | vDSP_vadd | dnnl_sum_primitive_desc | cldnn_eltwise_desc.mode = cldnn_eltwise_sum | Add | |||||||||||||||||||||
36 | input | input[0, 1]: tensor 0 and 1, up to 4-D | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} primaryImage of MPSNNGraph.encode, up to 4-D secondaryImage of MPSNNGraph.encode, up to 4-D | 4-D tensor (NCHW) DML_ELEMENT_WISE_ADD_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | src[0] (dnnl_query_src_md, 0) src[1] (dnnl_query_src_md, 1) ... src[n-1] (dnnl_query_src_md, n-1) | 4-D tensor (NHWC or NCHW) cldnn_eltwise_desc.input (size = 2) | Inputs["A"] and Inputs["B"]: tensor [Note]: no limitation on tensor dimension |
37 | scale | [Note]: not supported | MPSCNNArithmeticNode.{primaryScale, secondaryScale} | [Note]: not supported | scales of sum_primitive_desc: vector of scales to multiply the data in each source memory by | [Note]: not supported | [Note]: not supported |
38 | bias | [Note]: not supported | MPSCNNArithmeticNode.bias | [Note]: not supported | [Note]: not supported | [Note]: not supported | [Note]: not supported |
39 | fused activation | input[2]: RELU, RELU1, RELU6 | [Note]: not supported | DML_OPERATOR_ACTIVATION_RELU DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1} DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6} | [Note]: not supported | cldnn_eltwise_desc.with_activation [Note]: fused RELU1 and RELU6 are not supported | [Note]: not mentioned, might not be needed at IR level |
40 | output | output[0]: a tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} destinationImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_ELEMENT_WISE_ADD_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | dst_desc of sum_primitive_desc | Output tensor of primitive | Outputs["C"]: tensor | |||||||||||||||||||||||
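
NNAPI's fused-activation enum maps onto the clip/bounded-relu primitives listed above (RELU1 as clip to [-1, 1], RELU6 as clip to [0, 6]). A NumPy sketch of that emulation for backends without native RELU1/RELU6:

```python
import numpy as np

FUSED = {
    'NONE':  lambda x: x,
    'RELU':  lambda x: np.maximum(x, 0.0),
    'RELU1': lambda x: np.clip(x, -1.0, 1.0),  # DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1}
    'RELU6': lambda x: np.clip(x, 0.0, 6.0),   # DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6}
}

def add_fused(a, b, act='NONE'):
    return FUSED[act](a + b)

print(add_fused(np.array([3.0, -2.0]), np.array([5.0, 1.0]), 'RELU6'))  # [6. 0.]
```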

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
42 | Element-wise Multiply | op | ANEURALNETWORKS_MUL | MPSNNMultiplicationNode | DML_OPERATOR_ELEMENT_WISE_MULTIPLY + DML_ELEMENT_WISE_MULTIPLY_OPERATOR_DESC | vDSP_vmul | [Note]: not supported | cldnn_eltwise_desc.mode = cldnn_eltwise_prod | Mul | |||||||||||||||||||||
43 | input | input[0, 1]: tensor 0 and 1, up to 4-D | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} primaryImage of MPSNNGraph.encode, up to 4-D secondaryImage of MPSNNGraph.encode, up to 4-D | 4-D tensor (NCHW) DML_ELEMENT_WISE_MULTIPLY_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | 4-D tensor (NHWC or NCHW) cldnn_eltwise_desc.input (size = 2) | Inputs["A"] and Inputs["B"]: tensor [Note]: no limitation on tensor dimension |
44 | scale | [Note]: not supported | MPSCNNArithmeticNode.{primaryScale, secondaryScale} | [Note]: not supported | [Note]: not supported | [Note]: not supported |
45 | bias | [Note]: not supported | MPSCNNArithmeticNode.bias | [Note]: not supported | [Note]: not supported | [Note]: not supported |
46 | fused activation | input[2]: RELU, RELU1, RELU6 | [Note]: not supported | DML_OPERATOR_ACTIVATION_RELU DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1} DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6} | cldnn_eltwise_desc.with_activation [Note]: fused RELU1 and RELU6 are not supported | [Note]: not mentioned, might not be needed at IR level |
47 | output | output[0]: a tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} destinationImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_ELEMENT_WISE_MULTIPLY_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | Output tensor of primitive | Outputs["C"]: tensor | ||||||||||||||||||||||||
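
MPSCNNArithmeticNode is the only binary-op mapping above with per-operand scales and a bias; per Apple's documentation it computes roughly primaryScale * a &lt;op&gt; secondaryScale * b + bias. A hedged sketch of that formula (clamping omitted):

```python
import numpy as np

def mps_arithmetic(a, b, op, primary_scale=1.0, secondary_scale=1.0, bias=0.0):
    # Approximation of MPSCNNArithmeticNode's formula (result clamping omitted).
    return op(primary_scale * a, secondary_scale * b) + bias

a, b = np.array([2.0, 4.0]), np.array([3.0, 5.0])
print(mps_arithmetic(a, b, np.multiply, secondary_scale=0.5))   # [3. 10.]
```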

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
49 | Concatenation | op | ANEURALNETWORKS_CONCATENATION | MPSNNConcatenationNode | DML_OPERATOR_JOIN + DML_JOIN_OPERATOR_DESC | [Note]: not supported. BNNS stores pixels plane by plane per channel, so concatenation along the channel axis can be implemented with memcpy as a workaround | dnnl_concat_primitive_desc | cldnn_concatenation_desc | Concat |
50 | inputs | input[0 ~ n-1]: the list of n input tensors | MPSNNImageNode of input tensors | DML_JOIN_OPERATOR_DESC.InputTensors DML_JOIN_OPERATOR_DESC.InputCount | src[0] (dnnl_query_src_md, 0) src[1] (dnnl_query_src_md, 1) ... src[n-1] (dnnl_query_src_md, n-1) | cldnn_concatenation_desc.input (size 2 - n) | Inputs["inputs"]: list of tensors |
51 | axis | input[n]: specifies the concatenation axis | [Note]: only supports concatenation along the depth channel | DML_JOIN_OPERATOR_DESC.Axis | concat_dimension | cldnn_concatenation_desc.axis | Attributes["axis"]: which axis to concat on |
52 | output | output[0]: the output tensor | 4-D tensor (NHWC) MPSImageDescriptor.{numberOfImages, featureChannels, width, height} destinationImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_JOIN_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | dst(dnnl_query_dst_md, 0) | Output tensor of primitive | Outputs["concat_result"]: concatenated tensor | |||||||||||||||||||||||
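
The BNNS workaround above relies on planar CHW storage: concatenating along the channel axis is just copying each source's data block back to back. A NumPy sketch (`concat_channels_chw` is a hypothetical helper):

```python
import numpy as np

# With planar CHW storage, channel concatenation is back-to-back memcpy.
def concat_channels_chw(tensors):                 # each tensor: [C_i, H, W]
    flat = np.concatenate([t.ravel() for t in tensors])   # the "memcpy" step
    c = sum(t.shape[0] for t in tensors)
    h, w = tensors[0].shape[1:]
    return flat.reshape(c, h, w)

a = np.random.rand(2, 4, 4).astype(np.float32)
b = np.random.rand(3, 4, 4).astype(np.float32)
np.testing.assert_array_equal(concat_channels_chw([a, b]),
                              np.concatenate([a, b], axis=0))
```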

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
54 | Reshape | op | ANEURALNETWORKS_RESHAPE | MPSNNReshape | DML_OPERATOR_CAST + DML_CAST_OPERATOR_DESC | [Note]: not supported | dnnl_reorder_primitive_desc | cldnn_reshape_desc | Reshape | |||||||||||||||||||||
55 | input | input[0]: a tensor, up to 4-D | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} sourceImage of MPSNNGraph.encode, up to 4-D | 4-D tensor (NCHW) DML_CAST_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | src_desc of reorder_primitive_desc 4-D tensor {N,C,H,W} | cldnn_reshape_desc.input | Inputs["data"]: tensor |
56 | output shape | input[1]: a 1-D tensor of int32 defining the shape of the output tensor | [Note]: no need, output shape specified by destination image | [Note]: no need, output shape specified by destination tensor | [Note]: no need, output shape specified by destination image | cldnn_reshape_desc.output_shape | Inputs["shape"]: tensor(int64) |
57 | output | output[0]: the output tensor, of shape specified by the input shape | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} destinationImage of MPSNNGraph.encode, up to 4-D | 4-D tensor (NCHW) DML_CAST_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | dst_desc of reorder_primitive_desc 4-D tensor {N,C,H,W} | Output tensor of primitive | Outputs["reshaped"]: reshaped tensor |
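
One caveat the table cannot show: Reshape reinterprets elements in linear memory order, so reshaping NHWC data and reshaping the equivalent NCHW data give differently ordered results. A NumPy sketch:

```python
import numpy as np

nhwc = np.arange(24, dtype=np.float32).reshape(1, 2, 3, 4)   # N, H, W, C
nchw = nhwc.transpose(0, 3, 1, 2)                            # same values, NCHW

# Same target shape, different element order -> not interchangeable.
print(np.array_equal(nhwc.reshape(1, 24), nchw.reshape(1, 24)))   # False
```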

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
59 | Fully Connected | op | ANEURALNETWORKS_FULLY_CONNECTED | MPSCNNFullyConnectedNode | DML_OPERATOR_GEMM + DML_GEMM_OPERATOR_DESC | BNNSFilterCreateFullyConnectedLayer | dnnl_inner_product_forward_desc | cldnn_fully_connected_desc | ||||||||||||||||||||||
60 | input | input[0]: a tensor of at least rank 2 | Reshape to MPSImageDescriptor.{1, 1, product(dimensions) / weights[1], weights[1]} | Reshape {n, c, h, w} to {1, 1, input_batch_size, input_size} dimensions = {1, 1, input_batch_size, input_size} 4-D tensor (NCHW) DML_GEMM_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSVectorDescriptor: Represents a vector of dimension size. Each vector element is a scalar value, stored using the type specified in data_type. | input logical order is nc {input_batch_size, input_size} | cldnn_fully_connected_desc.input (reshape to 2-D tensor) |
61 | weights | input[1]: a 2-D tensor | MPSCNNConvolutionDescriptor.{1, 1, weights[0], weights[1]} MPSCNNConvolutionDataSource.weights | 4-D tensor (NCHW) DML_GEMM_OPERATOR_DESC.BTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSFullyConnectedLayerParameters.weights | weights logical order is oi {num_units, input_size} | cldnn_fully_connected_desc.weights |
62 | bias | input[2]: a 1-D tensor | a 1-D tensor whose size equals outputFeatureChannels MPSCNNConvolutionDataSource.biasTerms | dimensions = {1, 1, output_batch_size, output_num_units} bias_strides = {1, 1, 0, 1} 4-D tensor (NCHW) DML_GEMM_OPERATOR_DESC.CTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSFullyConnectedLayerParameters.bias | 1-D {bias_num_units} | cldnn_fully_connected_desc.bias |
63 | fused activation | input[3]: RELU, RELU1, RELU6 | MPSCNNConvolutionDescriptor.neuron MPSCNNNeuronReLU MPSCNNNeuronReLUN | DML_OPERATOR_ACTIVATION_RELU DML_OPERATOR_ELEMENT_WISE_CLIP {-1, 1} DML_OPERATOR_ELEMENT_WISE_CLIP {0, 6} | BNNSFullyConnectedLayerParameters.activation | post_ops::append_eltwise: eltwise_relu eltwise_bounded_relu | cldnn_fully_connected_desc.with_activation [Note]: fused RELU1 and RELU6 are not supported | |||||||||||||||||||||||
64 | output | output[0]: the output tensor | MPSImageDescriptor.{1, 1, output_size, outputFeatureChannels} destinationImage of MPSNNGraph.encode | 4-D tensor (NCHW) DML_GEMM_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSVectorDescriptor: Represents a vector of dimension size. Each vector element is a scalar value, stored using the type specified in data_type | output logical order is nc: {output_batch_size, output_num_units} | cldnn_fully_connected_desc.output_shape | |||||||||||||||||||||||
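
The fully-connected mapping above is a GEMM on a flattened input, which is how the DirectML column phrases it: reshape the input to [batch, input_size], multiply by the transposed [num_units, input_size] weights, add the 1-D bias. A NumPy sketch:

```python
import numpy as np

x = np.random.rand(2, 8, 2, 2).astype(np.float32)   # N, C, H, W
w = np.random.rand(10, 32).astype(np.float32)       # [num_units, input_size]
b = np.random.rand(10).astype(np.float32)           # [num_units]

out = x.reshape(x.shape[0], -1) @ w.T + b           # [batch, num_units]
print(out.shape)                                    # (2, 10)
```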

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
66 | Resize Bilinear | op | ANEURALNETWORKS_RESIZE_BILINEAR | MPSCNNUpsamplingBilinearNode | DML_OPERATOR_UPSAMPLE_2D + DML_UPSAMPLE_2D_OPERATOR_DESC | vImageVerticalShear and vImageHorizontalShear | dnnl_resampling_forward_desc alg_kind = dnnl_resampling_linear | [Note]: not supported (could be implemented via cldnn_custom_gpu_primitive_desc with a custom OpenCL kernel) |
67 | input | input[0]: a 4-D tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} | 4-D tensor (NCHW) DML_UPSAMPLE_2D_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | vImage_Buffer | src_desc of resampling_forward_desc 4-D tensor{N,C,H,W} | ||||||||||||||||||||||||
68 | output height | input[1]: height of output tensor | [Note]: must be an integral multiple of the input height | [Note]: must be an integral multiple of the input height | vImage_Buffer.height | the height of output tensor |
69 | output width | input[2]: width of output tensor | [Note]: must be an integral multiple of the input width | [Note]: must be an integral multiple of the input width | vImage_Buffer.width | the width of output tensor |
70 | output | output[0]: output tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} | 4-D tensor (NCHW) DML_UPSAMPLE_2D_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | vImage_Buffer | dst_desc of resampling_forward_desc 4-D tensor {N,C,H,W} |
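
MPS and DirectML above only upsample by integral factors, so a portability layer has to validate the requested output size before dispatching. A small sketch (`scale_factor` is a hypothetical helper):

```python
def scale_factor(in_size, out_size):
    # MPS/DirectML upsampling requires an integral scale factor per axis.
    if out_size % in_size != 0:
        raise ValueError(f'{out_size} is not an integral multiple of {in_size}')
    return out_size // in_size

print(scale_factor(8, 16))   # 2
```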

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
72 | Transpose | op | ANEURALNETWORKS_TRANSPOSE | |||||||||||||||||||||||||||
73 | input | input[0]: An n-D tensor | ||||||||||||||||||||||||||||
74 | permutation | input[1]: an optional 1-D tensor of int32 |
75 | output | output[0]: a tensor of the same type as input0 |
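
A NumPy sketch of ANEURALNETWORKS_TRANSPOSE's semantics: the optional permutation reorders the dimensions, and [0, 3, 1, 2] is the NHWC-to-NCHW conversion this table keeps needing; when the permutation is omitted, the dimensions are reversed:

```python
import numpy as np

x = np.random.rand(1, 4, 5, 3)                 # NHWC
print(np.transpose(x, (0, 3, 1, 2)).shape)     # NHWC -> NCHW: (1, 3, 4, 5)
print(np.transpose(x).shape)                   # no perm: reversed, (3, 5, 4, 1)
```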

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
77 | BatchToSpace | op | ANEURALNETWORKS_BATCH_TO_SPACE_ND | |||||||||||||||||||||||||||
78 | input | Input[0]: An n-D tensor to be reshaped Input[2]: An optional boolean scalar | ||||||||||||||||||||||||||||
79 | block sizes | Input[1]: A 1-D Tensor | ||||||||||||||||||||||||||||
80 | output | Output[0]: A tensor of the same type as input0. | ||||||||||||||||||||||||||||
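
A NumPy sketch of what ANEURALNETWORKS_BATCH_TO_SPACE_ND does for a 4-D NHWC tensor: blocks of the batch dimension are interleaved back into the spatial dimensions (the inverse of space-to-batch; optional cropping omitted; `batch_to_space_nhwc` is a hypothetical helper):

```python
import numpy as np

def batch_to_space_nhwc(x, bh, bw):
    nb, h, w, c = x.shape
    n = nb // (bh * bw)
    x = x.reshape(bh, bw, n, h, w, c)
    x = x.transpose(2, 3, 0, 4, 1, 5)          # -> n, h, bh, w, bw, c
    return x.reshape(n, h * bh, w * bw, c)     # batch blocks become pixels

print(batch_to_space_nhwc(np.zeros((4, 2, 2, 1)), 2, 2).shape)   # (1, 4, 4, 1)
```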

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
82 | Element-wise Maximum | op | ANEURALNETWORKS_MAXIMUM | |||||||||||||||||||||||||||
83 | input | Inputs[0]: A tensor. Inputs[1]: A tensor of the same type and compatible dimensions with input0. | ||||||||||||||||||||||||||||
84 | output | output[0]: A tensor of the same type as input0. | ||||||||||||||||||||||||||||
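
"Compatible dimensions" in the Maximum rows above means NumPy-style broadcasting; a short sketch:

```python
import numpy as np

a = np.array([[1.0, 5.0], [3.0, 2.0]])
b = np.array([2.0, 4.0])            # broadcast against the last axis of a
print(np.maximum(a, b))             # [[2. 5.] [3. 4.]]
```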

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
86 | Tanh | op | ANEURALNETWORKS_TANH | |||||||||||||||||||||||||||
87 | input | input[0]: A tensor | ||||||||||||||||||||||||||||
88 | output | output[0]: A tensor of same shape as input0. | ||||||||||||||||||||||||||||

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
90 | Argmax | op | ANEURALNETWORKS_ARGMAX | MPSNNReductionFeatureChannelsArgumentMaxNode | DML_OPERATOR_REDUCE + DML_REDUCE_OPERATOR_DESC | cldnn_arg_max_min_desc |
91 | input | input[0]: An n-D tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} | 4-D tensor (NCHW) DML_REDUCE_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | cldnn_arg_max_min_desc.input | |||||||||||||||||||||||||
92 | axis | input[1]: an int32 scalar | [Note]: only supports argmax along the depth channel | DML_REDUCE_OPERATOR_DESC.Axis | cldnn_arg_max_min_desc.axis |
93 | output | output[0]: an (n - 1)-D int32 tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} | 4-D tensor (NCHW) DML_REDUCE_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE_UINT32, DML_TENSOR_FLAG_NONE, dimensions, strides} | cldnn_arg_max_min_desc.output |
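
The Argmax output row above says the reduced axis disappears: an n-D input yields an (n-1)-D int32 index tensor. A sketch:

```python
import numpy as np

x = np.random.rand(2, 3, 4)
idx = np.argmax(x, axis=1).astype(np.int32)   # reduce axis 1
print(idx.shape, idx.dtype)                   # (2, 4) int32
```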

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
95 | Sigmoid | op | ANEURALNETWORKS_LOGISTIC | MPSCNNNeuronSigmoidNode | DML_OPERATOR_ACTIVATION_SIGMOID + DML_ACTIVATION_SIGMOID_OPERATOR_DESC | BNNSFilterCreateVectorActivationLayer BNNSActivation = BNNSActivationFunctionSigmoid | dnnl_eltwise_forward_desc alg_kind = dnnl_eltwise_logistic | cldnn_activation_desc with activation_desc.activation_func = activation_logistic |
96 | input | input[0]: a tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} | 4-D tensor (NCHW) DML_ACTIVATION_SIGMOID_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSVectorDescriptor {size, data_type, data_scale, data_bias}, size represents the vector dimension | src_desc 4-D tensor {N,C,H,W} | cldnn_activation_desc.input |
97 | output | output[0]: the output tensor of the same shape as input0 | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} | 4-D tensor (NCHW) DML_ACTIVATION_SIGMOID_OPERATOR_DESC.OutputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | BNNSVectorDescriptor {size, data_type, data_scale, data_bias}, size represents the vector dimension | dst_desc 4-D tensor {N,C,H,W} | cldnn_activation_desc.output |
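
The same op is named LOGISTIC (NNAPI), Sigmoid (MPS/DirectML/BNNS) and eltwise_logistic (DNNL/clDNN); all compute y = 1 / (1 + exp(-x)) with the input shape preserved:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-1.0, 0.0, 1.0])))   # [0.26894142 0.5 0.73105858]
```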

| # | Operation | Parameter | NNAPI | MPS | DirectML | BNNS | DNNL (MKL-DNN) | clDNN | ONNX |
|---|---|---|---|---|---|---|---|---|---|
99 | Prelu | op | ANEURALNETWORKS_PRELU | MPSCNNNeuronPReLUNode | DML_OPERATOR_ACTIVATION_PARAMETERIZED_RELU + DML_ACTIVATION_PARAMETERIZED_RELU_OPERATOR_DESC | cldnn_activation_desc with activation_desc.activation_func = activation_relu_negative_slope |
100 | input | input[0]: a tensor | MPSImageDescriptor.{numberOfImages, featureChannels, width, height} | 4-D tensor (NCHW) DML_ACTIVATION_PARAMETERIZED_RELU_OPERATOR_DESC.InputTensor {DML_TENSOR_DATA_TYPE, DML_TENSOR_FLAG_NONE, dimensions, strides} | cldnn_activation_desc.input |
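
A sketch of the PReLU formula behind the mappings above (DirectML's parameterized ReLU, clDNN's relu with negative slope): y = x for x >= 0 and alpha * x otherwise, where alpha may be per-channel:

```python
import numpy as np

def prelu(x, alpha):
    return np.where(x >= 0, x, alpha * x)

print(prelu(np.array([-2.0, 3.0]), 0.25))   # [-0.5  3. ]
```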