List of Operators Supported by MindSpore Lite Hardware Backends
Each cell of the table below gives the data type (FP16, FP32, or Bool) the operator's kernel supports on that backend; "-" means the operator is not supported on that backend.

| Operator Names | Operator Functions | CPU | NPU (Kirin) | GPU (Mali/Adreno) |
| --- | --- | --- | --- | --- |
| Abs | Element-wise absolute value | FP16 | FP16 | FP16 |
| AbsGrad | Compute the gradient of the absolute value function | FP32 | - | - |
| Activation | Activation functions | FP16 | FP16 | FP16 |
| ActivationGrad | Compute the gradient of a specific activation function | FP16 | - | - |
| Adam | Execute a single parameter-update step of the Adam optimizer | FP32 | - | - |
| AddFusion | Element-wise addition | FP16 | FP16 | FP16 |
| AdderFusion | Addition-based convolution operation | FP32 | - | - |
| AddGrad | Compute the gradient of the addition operation | FP32 | - | - |
| AddN | Perform element-wise addition on N input tensors of identical shape and data type | FP16 | - | - |
| Affine | Apply an affine transformation to the input tensor | FP32 | - | - |
| All | Determine whether all elements are True (non-zero) along the specified dimension | FP32 | - | - |
| AllGather | Distributed collective communication: gather tensors from all devices | FP32 | - | - |
| ApplyMomentum | Execute a single parameter-update step of stochastic gradient descent with momentum | FP32 | - | - |
| Assert | Assert that a condition holds | FP16 | - | - |
| Assign | Assign a value to a variable | FP32 | - | - |
| ArgmaxFusion | Return the index of the maximum value along a given dimension | FP16 | FP16 | FP16 |
| ArgminFusion | Return the index of the minimum value along a given dimension | FP16 | - | FP16 |
| AvgPoolFusion | Average pooling | FP16 | FP16 | FP16 |
| AvgPoolGrad | Compute the gradient of the average-pooling layer | FP16 | - | - |
| BatchNorm | Batch normalization | FP16 | - | FP16 |
| BatchNormGrad | Compute the gradient of the batch-normalization layer | FP16 | - | - |
| BatchToSpace | Inverse of the space-to-batch transformation | FP32 | - | FP16 |
| BatchToSpaceND | N-dimensional generalization of BatchToSpace | FP16 | - | FP16 |
| BiasAdd | Add a bias vector to the input tensor | FP16 | - | FP16 |
| BiasAddGrad | Compute the gradient of the BiasAdd operation | FP16 | - | - |
| BinaryCrossEntropy | Compute the binary cross-entropy loss | FP32 | - | - |
| BinaryCrossEntropyGrad | Compute the gradient of the binary cross-entropy loss | FP32 | - | - |
| BroadcastTo | Broadcast the input tensor to a given shape | FP16 | - | - |
| Call | Call a subgraph or function | FP16 | - | - |
| Cast | Data type conversion | FP16 | FP16 | FP16 |
| Ceil | Round up to the nearest integer | FP16 | FP16 | FP16 |
| Clip | Clamp elements to a specified range | FP32 | - | - |
| Concat | Concatenate tensors along a specified axis | FP16 | FP16 | FP16 |
| ConstantOfShape | Generate a tensor of the shape given by the input, filled with a specified constant | FP16 | - | - |
| Conv2DFusion | 2D convolution | FP16 | FP16 | FP16 |
| Conv2DBackpropFilterFusion | Compute the gradient of a standard convolution with respect to the convolution kernel | FP16 | - | - |
| Conv2DBackpropInputFusion | Compute the gradient of a standard convolution with respect to the input data | FP16 | - | - |
| Conv2dTransposeFusion | Transposed convolution | FP16 | FP16 | FP16 |
| Cos | Element-wise cosine | FP16 | FP16 | FP16 |
| Crop | Crop a specified region from an input image or feature map | FP16 | - | - |
| CropAndResize | Crop regions from the input image based on bounding boxes, then resize each region to a uniform size | FP32 | FP16 | - |
| CumSum | Cumulative sum of elements | FP32 | - | - |
| CustomExtractFeatures | Custom feature-extraction operator | FP32 | - | - |
| CustomNormalize | Custom normalization operator | FP32 | - | - |
| CustomPredict | Custom prediction operator | FP32 | - | - |
| DEConv2DGradFilter | Compute the gradient of a transposed convolution with respect to the convolution kernel | FP32 | - | - |
| DepthToSpace | Rearrange depth data into spatial dimensions | FP16 | - | FP16 |
| DetectionPostProcess | Post-processing for object detection | FP32 | - | - |
| DivFusion | Element-wise division | FP16 | FP16 | FP16 |
| DivGrad | Compute the gradient of the division operation | FP32 | - | - |
| Dropout | Randomly set some elements of the input tensor to zero | FP16 | - | - |
| DropoutGrad | Compute the gradient of the Dropout operation | FP16 | - | - |
| DynamicQuant | Dynamically quantize floating-point tensors to uint8 | FP32 | - | - |
| Eltwise | Element-wise operations | FP16 | FP16 | FP16 |
| Elu | ELU activation function, applying an exponential correction to negative inputs | FP16 | - | - |
| Equal | Element-wise equality comparison | FP16 | FP16 | FP16 |
| EmbeddingLookupFusion | Optimized word-embedding lookup, mapping integer indices to dense vectors | FP32 | - | - |
| Erf | Error function | FP16 | - | - |
| ExpFusion | Element-wise exponential (e^x) | FP16 | - | FP16 |
| ExpandDims | Insert a dimension of length 1 at the specified position | FP16 | FP16 | FP16 |
| Fill | Generate a tensor filled with a specified constant | FP16 | - | FP16 |
| Flatten | Flatten the input tensor into a two-dimensional tensor | FP16 | - | - |
| FlattenGrad | Compute the gradient of the Flatten operation | FP16 | - | - |
| Floor | Round down to the nearest integer | FP16 | FP16 | FP16 |
| FloorDiv | Element-wise division, rounding the result down to the nearest integer | FP16 | FP16 | FP16 |
| FloorMod | Element-wise modulo; the sign of the result matches that of the divisor | FP16 | FP16 | FP16 |
| FullConnection | Fully-connected layer | FP16 | FP16 | FP16 |
| FusedBatchNorm | Fused batch normalization | FP16 | FP16 | - |
| GatherNd | Gather elements from the input tensor at positions given by an index tensor | FP16 | - | FP16 |
| Gather | Gather elements at specified indices along a single dimension | FP16 | FP16 | FP16 |
| GatherD | Gather elements from the input tensor according to an index tensor | FP16 | - | - |
| GLU | Gated linear unit: split the input into two parts and multiply them element-wise | FP32 | - | - |
| Greater | Element-wise comparison of two tensors, returning a Boolean result for A > B | FP16 | FP16 | FP16 |
| GreaterEqual | Element-wise comparison of two tensors, returning a Boolean result for A ≥ B | FP16 | FP16 | FP16 |
| GroupNormFusion | Group normalization with fusion optimization | FP32 | - | - |
| GRU | Gated recurrent unit, a simplified LSTM | FP16 | - | - |
| HashtableLookup | Hash-table lookup | FP32 | - | - |
| InstanceNorm | Instance normalization | FP16 | FP16 | - |
| InvertPermutation | Compute the inverse of a permutation | FP16 | - | - |
| IsFinite | Check whether each element of the tensor is finite (not Inf/NaN) | FP32 | - | - |
| L2NormalizeFusion | L2 normalization with fusion optimization | FP32 | - | - |
| LayerNormFusion | Layer normalization with fusion optimization | FP16 | - | FP16 |
| LayerNormGrad | Compute the gradient of layer normalization | FP16 | - | - |
| LeakyReLU | Leaky ReLU activation function, which assigns a small slope to negative inputs | FP16 | FP16 | FP16 |
| Less | Element-wise comparison of two tensors, returning a Boolean result for A < B | FP16 | FP16 | FP16 |
| LessEqual | Element-wise comparison of two tensors, returning a Boolean result for A ≤ B | FP16 | FP16 | FP16 |
| LRN | Local response normalization | FP32 | - | - |
| Log | Element-wise logarithm | FP16 | FP16 | FP16 |
| Log1p | Compute log(1 + x) | FP32 | - | - |
| LogGrad | Compute the gradient of the logarithm function | FP16 | - | - |
| LogicalAnd | Element-wise logical AND | FP16 | FP16 | FP16 |
| LogicalNot | Element-wise logical NOT | FP16 | FP16 | FP16 |
| LogicalOr | Element-wise logical OR | FP16 | FP16 | FP16 |
| LogSoftmax | Apply softmax to the input, then take the logarithm of the result | FP16 | - | - |
| LshProjection | Locality-sensitive hash projection | FP32 | - | - |
| LSTM | Long short-term memory (LSTM) network unit | FP16 | - | - |
| LSTMGrad | Compute the backpropagation gradient of the LSTM with respect to the hidden state | FP32 | - | - |
| LSTMGradData | Compute the backpropagation gradient of the LSTM with respect to the input data | FP32 | - | - |
| LSTMGradWeight | Compute the backpropagation gradient of the LSTM with respect to the weights | FP32 | - | - |
| MatMulFusion | Matrix multiplication of two inputs | FP16 | FP16 | FP16 |
| Maximum | Element-wise maximum | FP16 | FP16 | FP16 |
| MaximumGrad | Compute the gradient of the maximum function | FP16 | - | - |
| MaxPoolFusion | Max pooling | FP16 | FP16 | FP16 |
| MaxPoolGrad | Compute the gradient of the max-pooling layer | FP16 | - | - |
| Merge | Control-flow merge: forward the value of an available input branch | FP16 | - | - |
| Minimum | Element-wise minimum | FP16 | FP16 | FP16 |
| MinimumGrad | Compute the gradient of the minimum function | FP16 | - | - |
| Mod | Return the remainder of a division operation | FP32 | - | - |
| MulFusion | Element-wise multiplication | FP16 | FP16 | FP16 |
| MulGrad | Compute the gradient of the multiplication operation | FP32 | - | - |
| Neg | Element-wise negation | FP16 | FP16 | FP16 |
| NegGrad | Compute the gradient of the negation operation | FP16 | - | - |
| NLLLoss | Compute the negative log-likelihood loss | FP32 | - | - |
| NLLLossGrad | Compute the gradient of NLLLoss | FP32 | - | - |
| NotEqual | Element-wise comparison of two tensors, returning a Boolean result for A != B | FP16 | FP16 | FP16 |
| NonMaxSuppression | Non-maximum suppression | FP32 | - | - |
| NonZero | Return the indices of all non-zero elements in the input tensor | Bool | - | - |
| OneHot | Convert an integer index tensor to a one-hot encoding | FP16 | - | FP16 |
| OnesLike | Create a tensor with the same shape as the input tensor, with all elements set to 1 | FP16 | - | - |
| PadFusion | Pad the input tensor to the desired size | FP16 | FP16 | FP16 |
| PartialFusion | Wrap part of the graph as a subgraph to be invoked via Call | FP16 | - | - |
| PowFusion | Element-wise power | FP16 | - | FP16 |
| PowerGrad | Compute the gradient of the power operation | FP32 | - | - |
| PriorBox | Generate prior (anchor) boxes | FP32 | - | - |
| PReLUFusion | PReLU activation function | FP16 | - | FP16 |
| QuantDTypeCast | Convert between quantized data types | FP16 | - | - |
| RaggedRange | Generate sequences with non-uniform intervals | FP16 | - | - |
| RandomNormal | Generate a tensor of values randomly sampled from a normal distribution | FP16 | - | - |
| RandomStandardNormal | Generate a random tensor following the standard normal distribution | FP16 | - | - |
| Range | Generate a sequence of elements within a specified range | FP16 | - | - |
| Rank | Return the number of dimensions of the input tensor | FP16 | - | - |
| RealDiv | Element-wise division | FP16 | - | - |
| Reciprocal | Element-wise reciprocal | FP16 | FP16 | - |
| ReduceFusion | Reduction operations along specified axes | FP16 | FP16 | FP16 |
| ReduceScatter | Distributed operation: the input tensor is split across devices, with each device keeping one segment of the result | FP32 | - | - |
| Reshape | Change the shape of a tensor while keeping the total number of elements unchanged | FP16 | FP16 | FP16 |
| Resize | Upsample or resize the input tensor | FP16 | FP16 | FP16 |
| ResizeGrad | Compute the gradient of Resize | FP16 | - | - |
| ReverseV2 | Reverse the tensor along the specified axis | FP32 | - | - |
| ReverseSequence | Partially reverse variable-length sequences of the input tensor | FP32 | - | - |
| ROIPooling | Region-of-interest (ROI) pooling | FP32 | - | - |
| Round | Round to the nearest integer | FP16 | FP16 | FP16 |
| Rsqrt | Element-wise reciprocal of the square root | FP16 | FP16 | FP16 |
| RsqrtGrad | Compute the gradient of the reciprocal square root | FP32 | - | - |
| Select | Select elements from two tensors based on a condition | FP32 | - | - |
| Selu | Scaled exponential linear unit (SELU), a self-normalizing activation function | - | - | - |
| ScaleFusion | Fuse scaling operations with adjacent operators | FP16 | FP16 | FP16 |
| ScatterNd | Scatter values from the input tensor to specified positions of the output tensor according to the index | FP16 | - | - |
| ScatterNdUpdate | Update values of the input data at the given indices with the given values | FP16 | - | - |
| SGD | Stochastic gradient descent optimizer | FP32 | - | - |
| Shape | Obtain the shape of a tensor | FP16 | - | FP16 |
| SigmoidCrossEntropyWithLogits | Combine sigmoid activation with cross-entropy loss | FP32 | - | - |
| SigmoidCrossEntropyWithLogitsGrad | Compute the gradient of the sigmoid cross-entropy loss | FP32 | - | - |
| Sin | Element-wise sine | FP16 | FP16 | FP16 |
| Size | Obtain the number of elements in a tensor | FP16 | - | - |
| SliceFusion | Tensor slicing | FP16 | FP16 | FP16 |
| SkipGram | Core operation of the skip-gram model, used for training word vectors | FP32 | - | - |
| SmoothL1Loss | Smooth L1 loss | FP32 | - | - |
| SmoothL1LossGrad | Compute the gradient of the smooth L1 loss | FP32 | - | - |
| Softmax | Softmax normalization | FP16 | FP16 | FP16 |
| SoftmaxGrad | Compute the gradient of Softmax | FP32 | - | - |
| Softplus | Smooth ReLU variant | FP16 | - | - |
| SpaceToBatch | Move blocks of spatial data into the batch dimension | FP16 | - | FP16 |
| SpaceToBatchND | Split spatial data blocks into the batch dimension (N-dimensional) | FP16 | - | FP16 |
| SpaceToDepth | Rearrange spatial data into depth channels | FP16 | - | FP16 |
| SparseToDense | Convert a sparse representation to a dense tensor | FP16 | - | FP16 |
| SparseSoftmaxCrossEntropyWithLogits | Softmax cross-entropy for sparse labels | FP32 | - | - |
| Splice | Join multiple slices or ranges of the input tensor along a specified axis | FP16 | - | - |
| Split | Split the input tensor into multiple smaller tensors along a specified axis | FP16 | FP16 | FP16 |
| SplitWithOverlap | Split a tensor into overlapping segments | FP16 | - | - |
| Sqrt | Element-wise take the square root | FP16 | FP16 | FP16 |
| SqrtGrad | Calculate the gradient of the square root | FP32 | - | - |
| Square | Element-wise square | FP16 | FP16 | FP16 |
| SquaredDifference | Element-wise compute (A-B)² | FP16 | - | FP16 |
| Squeeze | Remove dimension of size 1 | FP16 | - | FP16 |
| StridedSlice | Strided tensor slicing | FP16 | FP16 | FP16 |
| StridedSliceGrad | Compute the gradient of the strided-slice operation | FP16 | - | - |
| Stack | Stack multiple tensors along a new axis | FP16 | - | FP16 |
| SubFusion | Element-wise subtraction | FP16 | FP16 | FP16 |
| SubGrad | Compute the gradient of the subtraction operation | FP32 | - | - |
| Switch | Select an output branch based on a Boolean condition | FP16 | - | - |
| SwitchLayer | Select among subgraph branches to execute within the model | FP16 | - | - |
| TensorListFromTensor | Convert a regular tensor into a tensor list, splitting along the specified axis | FP16 | - | - |
| TensorListGetItem | Retrieve the tensor at the specified index from a tensor list | FP16 | - | - |
| TensorListReserve | Preallocate an empty tensor list with the specified element data type and initial capacity | FP16 | - | - |
| TensorListSetItem | Insert a tensor at the specified position in a tensor list | FP16 | - | - |
| TensorListStack | Stack a tensor list into a single regular tensor | FP16 | - | - |
| TensorScatterAdd | Add update values to the specified positions of the target tensor according to the index | FP32 | - | - |
| TileFusion | Tile the input tensor by repeating it along each dimension | FP16 | FP16 | - |
| TopKFusion | Return the top K elements from the input tensor | FP16 | - | - |
| Transpose | Tensor transpose | FP16 | FP16 | FP16 |
| UniformReal | Generate a random tensor following a uniform distribution | FP32 | - | - |
| Unique | Return the unique values in the input tensor, along with their indices and count | FP16 | - | - |
| UnsortedSegmentSum | Perform segmented summation on the tensor without requiring ordered segment indices | FP16 | - | - |
| Unsqueeze | Add a new dimension to the input tensor | FP16 | FP16 | FP16 |
| Unstack | Split a tensor into multiple sub-tensors along a specified axis | FP16 | - | - |
| Where | Select elements based on a condition | FP16 | - | - |
| ZerosLike | Generate a tensor with the same shape as the input tensor but with all elements set to zero | FP16 | - | - |
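
The backend columns above correspond to the device targets placed in a MindSpore Lite context: devices are tried in list order, and operators the first device cannot run fall back to the next entry. The following is a minimal sketch, assuming the MindSpore Lite 2.x C++ API; the model path `model.ms` is a placeholder chosen for illustration. It prefers the GPU with FP16 kernels enabled and falls back to an FP16-enabled CPU.

```cpp
#include <memory>

#include "include/api/context.h"
#include "include/api/model.h"
#include "include/api/status.h"

int main() {
  // Build a context whose device list is tried in order: GPU first, CPU as fallback.
  auto context = std::make_shared<mindspore::Context>();
  auto &device_list = context->MutableDeviceInfo();

  auto gpu_info = std::make_shared<mindspore::GPUDeviceInfo>();
  gpu_info->SetEnableFP16(true);  // request FP16 kernels where the GPU column above says FP16
  device_list.push_back(gpu_info);

  auto cpu_info = std::make_shared<mindspore::CPUDeviceInfo>();
  cpu_info->SetEnableFP16(true);  // CPU FP16 for operators marked FP16 in the CPU column
  device_list.push_back(cpu_info);

  // "model.ms" is a placeholder path; kMindIR is the MindSpore Lite model format.
  mindspore::Model model;
  auto status = model.Build("model.ms", mindspore::kMindIR, context);
  return status == mindspore::kSuccess ? 0 : 1;
}
```

On Kirin hardware, a `KirinNPUDeviceInfo` entry placed at the front of the same device list targets the NPU column instead.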