mindspore.nn
Neural Network Cell
Predefined building blocks and computational units for constructing neural networks.
For the mindspore.nn operators that were added or deleted since the previous version, and for changes to their supported platforms, refer to API Updates.
Basic Block
API Name |
Description |
Supported Platforms |
The basic building block of neural networks in MindSpore. |
|
|
Base class for running the graph loaded from a MindIR. |
|
|
Base class for other losses. |
|
|
Base class for updating parameters. |
|
Container
API Name |
Description |
Supported Platforms |
Holds Cells in a list. |
|
|
Sequential Cell container. |
|
Wrapper Layer
API Name |
Description |
Supported Platforms |
A distributed optimizer. |
|
|
Dynamic Loss scale update cell. |
|
|
Update cell with fixed loss scaling value. |
|
|
Encapsulate training network. |
|
|
Cell to run for getting the next operation. |
|
|
This function splits the input along the 0th dimension into interleave_num pieces and then performs the computation of the wrapped cell. |
|
|
Cell that updates parameter. |
|
|
Wrap the network with Micro Batch. |
|
|
The time distributed layer. |
|
|
Network training package class. |
|
|
Network training with loss scaling. |
|
|
Wraps the forward network with the loss function. |
|
|
Cell that returns the gradients. |
|
|
Cell with loss function. |
|
Convolutional Neural Network Layer
API Name |
Description |
Supported Platforms |
Calculates the 1D convolution on the input tensor. |
|
|
Calculates a 1D transposed convolution, which can be regarded as Conv1d for the gradient of the input, also called deconvolution (although it is not an actual deconvolution). |
|
|
Calculates the 2D convolution on the input tensor. |
|
|
Calculates a 2D transposed convolution, which can be regarded as Conv2d for the gradient of the input, also called deconvolution (although it is not an actual deconvolution). |
|
|
Calculates the 3D convolution on the input tensor. |
|
|
Calculates a 3D transposed convolution, which can be regarded as Conv3d for the gradient of the input. |
|
|
Extracts patches from images. |
|
Recurrent Neural Network Layer
API Name |
Description |
Supported Platforms |
Stacked Elman RNN layers. |
|
|
An Elman RNN cell with tanh or ReLU non-linearity. |
|
|
Stacked GRU (Gated Recurrent Unit) layers. |
|
|
A GRU (Gated Recurrent Unit) cell. |
|
|
Stacked LSTM (Long Short-Term Memory) layers. |
|
|
An LSTM (Long Short-Term Memory) cell. |
|
Embedding Layer
API Name |
Description |
Supported Platforms |
A simple lookup table that stores embeddings of a fixed dictionary and size. |
|
|
EmbeddingLookup layer. |
|
|
Returns a slice of input tensor based on the specified indices and the field ids. |
|
Nonlinear Activation Function Layer
API Name |
Description |
Supported Platforms |
Continuously differentiable exponential linear units activation function. |
|
|
Exponential Linear Unit activation function. |
|
|
Fast Gaussian error linear unit activation function. |
|
|
Gaussian error linear unit activation function. |
|
|
Applies the gated linear unit function. |
|
|
Gets the activation function. |
|
|
Applies the Hardtanh function element-wise. |
|
|
Hard Shrink activation function. |
|
|
Hard sigmoid activation function. |
|
|
Applies hswish-type activation element-wise. |
|
|
Leaky ReLU activation function. |
|
|
Applies logsigmoid activation element-wise. |
|
|
Applies the LogSoftmax function to n-dimensional input tensor. |
|
|
Local Response Normalization. |
|
|
Computes MISH (A Self Regularized Non-Monotonic Neural Activation Function) of input tensors element-wise. |
|
|
Softsign activation function. |
|
|
PReLU activation function. |
|
|
Rectified Linear Unit activation function. |
|
|
Compute ReLU6 activation function. |
|
|
Randomized Leaky ReLU activation function. |
|
|
Activation function SeLU (Scaled exponential Linear Unit). |
|
|
Sigmoid Linear Unit activation function. |
|
|
Sigmoid activation function. |
|
|
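Softmin activation function, which applies Softmax to the negated input and, like Softmax, generalizes a two-category function to multiple categories. |
|
|
Softmax activation function, which generalizes the two-category Sigmoid function to multiple categories. |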
|
|
Applies SoftMax over features to each spatial location. |
|
|
Applies the SoftShrink function element-wise. |
|
|
Applies the Tanh function element-wise and returns a new tensor with the hyperbolic tangent of the elements of the input. The input is a Tensor with any valid shape. |
|
|
Tanhshrink activation function. |
|
|
Thresholds each element of the input Tensor. |
|
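The relationship between the Softmax, Softmin, and Sigmoid entries in the table above can be illustrated with a small plain-Python sketch. It is not MindSpore code: the helper names `softmax`, `softmin`, and `sigmoid` are hypothetical and operate on flat lists rather than tensors.

```python
import math


def softmax(values):
    # Subtract the maximum first for numerical stability.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmin(values):
    # Softmin is Softmax applied to the negated input.
    return softmax([-v for v in values])


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


logits = [1.0, 2.0, 3.0]
probs = softmax(logits)          # largest logit gets the largest probability
inverse_probs = softmin(logits)  # smallest logit gets the largest probability
```

With exactly two categories and logits `[x, 0]`, the first Softmax output equals `sigmoid(x)`, which is the sense in which Softmax extends a two-category function.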
Linear Layer
API Name |
Description |
Supported Platforms |
The dense connected layer. |
|
|
The bilinear dense connected layer. |
|
Dropout Layer
API Name |
Description |
Supported Platforms |
Dropout layer for the input. |
|
|
During training, randomly zeroes entire channels of the input tensor with probability p from a Bernoulli distribution (for a 3-dimensional tensor with a shape of \(NCL\), the channel feature map refers to a 1-dimensional feature map with a shape of \(L\)). |
|
|
During training, randomly zeroes some channels of the input tensor with probability p from a Bernoulli distribution (for a 4-dimensional tensor with a shape of \(NCHW\), the channel feature map refers to a 2-dimensional feature map with a shape of \(HW\)). |
|
|
During training, randomly zeroes some channels of the input tensor with probability p from a Bernoulli distribution (for a 5-dimensional tensor with a shape of \(NCDHW\), the channel feature map refers to a 3-dimensional feature map with a shape of \(DHW\)). |
|
Normalization Layer
API Name |
Description |
Supported Platforms |
This layer applies Batch Normalization over a 2D input (a mini-batch of 1D inputs) to reduce internal covariate shift. |
|
|
Batch Normalization is widely used in convolutional networks. |
|
|
Batch Normalization is widely used in convolutional networks. |
|
|
Group Normalization over a mini-batch of inputs. |
|
|
This layer applies Instance Normalization over a 3D input (a mini-batch of 1D inputs with additional channel dimension). |
|
|
This layer applies Instance Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension). |
|
|
This layer applies Instance Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension). |
|
|
Applies Layer Normalization over a mini-batch of inputs. |
|
|
Sync Batch Normalization layer over a N-dimension input. |
|
Pooling Layer
API Name |
Description |
Supported Platforms |
Applies a 1D adaptive average pooling over an input Tensor which can be regarded as a composition of 1D input planes. |
|
|
This operator applies a 2D adaptive average pooling to an input signal composed of multiple input planes. |
|
|
This operator applies a 3D adaptive average pooling to an input signal composed of multiple input planes. |
|
|
Applies a 1D adaptive maximum pooling over an input Tensor which can be regarded as a composition of 1D input planes. |
|
|
This operator applies a 2D adaptive max pooling to an input signal composed of multiple input planes. |
|
|
Applies a 3D adaptive max pooling over an input signal composed of several input planes. |
|
|
Applies a 1D average pooling over an input Tensor which can be regarded as a composition of 1D input planes. |
|
|
Applies a 2D average pooling over an input Tensor which can be regarded as a composition of 2D input planes. |
|
|
Applies a 3D average pooling over an input Tensor which can be regarded as a composition of 3D input planes. |
|
|
Applies a 2D fractional max pooling to an input signal composed of multiple input planes. |
|
|
This operator applies a 3D fractional max pooling over an input signal composed of several input planes. |
|
|
Applies a 1D power lp pooling over an input signal composed of several input planes. |
|
|
Applies a 2D power lp pooling over an input signal composed of several input planes. |
|
|
Applies a 1D max pooling over an input Tensor which can be regarded as a composition of 1D planes. |
|
|
Applies a 2D max pooling over an input Tensor which can be regarded as a composition of 2D planes. |
|
|
3D max pooling operation. |
|
|
Computes a partial inverse of MaxPool1d. |
|
|
Computes a partial inverse of MaxPool2d. |
|
|
Computes a partial inverse of MaxPool3d. |
|
Padding Layer
API Name |
Description |
Supported Platforms |
Pads the input tensor according to the paddings and mode. |
|
|
Pads the last dimension of the input tensor with a given constant value. |
|
|
Pads the last two dimensions of the input tensor with a given constant value. |
|
|
Pads the last three dimensions of the input tensor with a given constant value. |
|
|
Applies reflection padding to the given tensor using the given padding. |
|
|
Applies reflection padding to the given tensor using the given padding. |
|
|
Pad on W dimension of input x according to padding. |
|
|
Pad on HW dimension of input x according to padding. |
|
|
Pad on DHW dimension of input x according to padding. |
|
|
Pads the last two dimensions of input tensor with zero. |
|
Loss Function
API Name |
Description |
Supported Platforms |
BCELoss creates a criterion to measure the binary cross entropy between the true labels and predicted labels. |
|
|
Adds sigmoid activation function to input logits, and uses the given logits to compute binary cross entropy between the logits and the labels. |
|
|
CosineEmbeddingLoss creates a criterion to measure the similarity between two tensors using cosine distance. |
|
|
The cross entropy loss between input and target. |
|
|
Calculates the CTC (Connectionist Temporal Classification) loss. |
|
|
The Dice coefficient is a set similarity loss, which is used to calculate the similarity between two samples. |
|
|
It is a loss function to solve the imbalance of categories and the difference of classification difficulty. |
|
|
Gaussian negative log likelihood loss. |
|
|
Hinge Embedding Loss. |
|
|
HuberLoss calculates the error between the predicted value and the target value. |
|
|
Computes the Kullback-Leibler divergence between the logits and the labels. |
|
|
L1Loss is used to calculate the mean absolute error between the predicted value and the target value. |
|
|
MarginRankingLoss creates a criterion that measures the loss. |
|
|
Calculates the mean squared error between the predicted value and the label value. |
|
|
When there are multiple classifications, label is transformed into multiple binary classifications by one hot. |
|
|
Gets the negative log likelihood loss between logits and labels. |
|
|
RMSELoss creates a criterion to measure the root mean square error between \(x\) and \(y\) element-wise, where \(x\) is the input and \(y\) is the labels. |
|
|
Computes the sampled softmax training loss. |
|
|
SmoothL1 loss function, if the absolute error element-wise between the predicted value and the target value is less than the set threshold beta, the square term is used, otherwise the absolute error term is used. |
|
|
A loss class for two-class classification problems. |
|
|
Computes softmax cross entropy between logits and labels. |
|
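The SmoothL1 description in the table above (a squared term below the threshold beta, an absolute-error term otherwise) can be sketched in plain Python. This is an illustration only: the function name `smooth_l1` is hypothetical, the scaling constants follow the commonly used SmoothL1 definition and are assumed rather than taken from this page, and no reduction is applied.

```python
def smooth_l1(prediction, target, beta=1.0):
    """Element-wise SmoothL1 on two equal-length lists (no reduction)."""
    losses = []
    for p, t in zip(prediction, target):
        error = abs(p - t)
        if error < beta:
            # Quadratic branch for small errors.
            losses.append(0.5 * error * error / beta)
        else:
            # Linear branch for large errors.
            losses.append(error - 0.5 * beta)
    return losses


losses = smooth_l1([1.0, 2.0, 3.0], [1.0, 2.5, 6.0], beta=1.0)
```

The two branches meet at an error of exactly beta, so the loss is continuous there while large errors grow only linearly.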
Optimizer
API Name |
Description |
Supported Platforms |
Implements the Adadelta algorithm. |
|
|
Implements the Adagrad algorithm. |
|
|
Implements the Adaptive Moment Estimation (Adam) algorithm. |
|
|
Implements the AdaMax algorithm, a variant of Adaptive Moment Estimation (Adam) based on the infinity norm. |
|
|
This optimizer will offload Adam optimizer to host CPU and keep parameters being updated on the device, to minimize the memory cost. |
|
|
Implements the Adam algorithm with weight decay. |
|
|
Enable the adasum in "auto_parallel/semi_auto_parallel" mode. |
|
|
Enable the adasum in "auto_parallel/semi_auto_parallel" mode. |
|
|
Implements Average Stochastic Gradient Descent. |
|
|
Implements the FTRL algorithm. |
|
|
Implements the LAMB (Layer-wise Adaptive Moments optimizer for Batch training) algorithm. |
|
|
Implements the LARS algorithm. |
|
|
Implements the Adaptive Moment Estimation (Adam) algorithm. |
|
|
Implements the Momentum algorithm. |
|
|
Implements the ProximalAdagrad algorithm. |
|
|
Implements Root Mean Squared Propagation (RMSProp) algorithm. |
|
|
Implements Resilient backpropagation. |
|
|
Implements stochastic gradient descent. |
|
|
Updates gradients using the second-order algorithm THOR. |
|
Dynamic Learning Rate
LearningRateSchedule Class
The dynamic learning rates in this module are all subclasses of LearningRateSchedule. Pass an instance of such a subclass to an optimizer. During training, the optimizer calls the instance with the current step as input to get the current learning rate.
import mindspore.nn as nn
min_lr = 0.01
max_lr = 0.1
decay_steps = 4
cosine_decay_lr = nn.CosineDecayLR(min_lr, max_lr, decay_steps)
net = Net()  # Net is assumed to be a user-defined nn.Cell subclass (not shown here)
optim = nn.Momentum(net.trainable_params(), learning_rate=cosine_decay_lr, momentum=0.9)
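The call pattern described above, in which the optimizer invokes the schedule instance with the current step, can be sketched in plain Python. This is a minimal illustration, not MindSpore's implementation: the class name `CosineDecaySketch` is hypothetical, and the formula is the standard cosine decay, assumed here to correspond to what `nn.CosineDecayLR` computes.

```python
import math


class CosineDecaySketch:
    """Hypothetical stand-in for a LearningRateSchedule subclass (plain Python)."""

    def __init__(self, min_lr, max_lr, decay_steps):
        self.min_lr = min_lr
        self.max_lr = max_lr
        self.decay_steps = decay_steps

    def __call__(self, global_step):
        # Clamp the step so the rate stays at min_lr once decay_steps is reached.
        step = min(global_step, self.decay_steps)
        cosine = 1.0 + math.cos(math.pi * step / self.decay_steps)
        return self.min_lr + 0.5 * (self.max_lr - self.min_lr) * cosine


schedule = CosineDecaySketch(min_lr=0.01, max_lr=0.1, decay_steps=4)
# An optimizer would make this call once per training step.
rates = [schedule(step) for step in range(6)]
```

In this sketch the rate starts at `max_lr`, reaches `min_lr` at `decay_steps`, and is held there for later steps.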
API Name |
Description |
Supported Platforms |
Calculates learning rate based on cosine decay function. |
|
|
Calculates learning rate based on exponential decay function. |
|
|
Calculates learning rate based on inverse-time decay function. |
|
|
Calculates learning rate based on natural exponential decay function. |
|
|
Calculates learning rate based on polynomial decay function. |
|
|
Gets learning rate warming up. |
|
Dynamic LR Function
The dynamic learning rates in this module are all functions. Call the function and pass the result to an optimizer. During training, the optimizer takes result[current step] as the current learning rate.
import mindspore.nn as nn
min_lr = 0.01
max_lr = 0.1
total_step = 6
step_per_epoch = 1
decay_epoch = 4
lr = nn.cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch)
net = Net()  # Net is assumed to be a user-defined nn.Cell subclass (not shown here)
optim = nn.Momentum(net.trainable_params(), learning_rate=lr, momentum=0.9)
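The function-style approach described above, in which a list of rates is computed up front and indexed by step, can be sketched in plain Python. The helper name `cosine_decay_list` is hypothetical, and the per-epoch cosine formula is an assumption about what `nn.cosine_decay_lr` computes rather than a copy of its source.

```python
import math


def cosine_decay_list(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):
    """Hypothetical helper: precompute one learning rate per training step."""
    rates = []
    for step in range(total_step):
        # The rate changes once per epoch and is held after decay_epoch.
        epoch = min(step // step_per_epoch, decay_epoch)
        cosine = 1.0 + math.cos(math.pi * epoch / decay_epoch)
        rates.append(min_lr + 0.5 * (max_lr - min_lr) * cosine)
    return rates


lr = cosine_decay_list(0.01, 0.1, total_step=6, step_per_epoch=1, decay_epoch=4)
current_step = 2
current_lr = lr[current_step]  # how an optimizer would index the list
```

Unlike a schedule instance, the list has a fixed length, so `total_step` has to cover every step the training loop will run.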
API Name |
Description |
Supported Platforms |
Calculates learning rate based on cosine decay function. |
|
|
Calculates learning rate based on exponential decay function. |
|
|
Calculates learning rate based on inverse-time decay function. |
|
|
Calculates learning rate based on natural exponential decay function. |
|
|
Gets the piecewise constant learning rate. |
|
|
Calculates learning rate based on polynomial decay function. |
|
|
Gets learning rate warming up. |
|
Image Processing Layer
API Name |
Description |
Supported Platforms |
Applies a pixelshuffle operation over an input signal composed of several input planes. |
|
|
Applies a pixelunshuffle operation over an input signal composed of several input planes. |
|
|
Samples the input tensor to the given size or scale_factor using bilinear interpolation. |
|
Tools
API Name |
Description |
Supported Platforms |
Divide the channels in a tensor of shape \((*, C, H, W)\) into g groups and rearrange them as \((*, \frac{C}{g}, g, H, W)\), while keeping the original tensor shape. |
|
|
Flatten the dimensions other than the 0th dimension of the input Tensor. |
|
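The two operations in the table above can be sketched on nested Python lists. This is an illustration under stated assumptions, not MindSpore code: `channel_shuffle` and `flatten_keep_batch` are hypothetical helpers, and the shuffle follows the usual reshape, transpose, and flatten recipe applied to the channel axis only.

```python
def channel_shuffle(channels, groups):
    """Reorder a list of per-channel items; `channels` has length C."""
    count = len(channels)
    if count % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = count // groups
    # View the channels as a (groups, per_group) grid, then read it column by
    # column: the transpose to (per_group, groups) followed by flattening.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


def flatten_keep_batch(batch):
    """Flatten every dimension except the 0th of a nested list."""
    def flatten(item):
        if isinstance(item, list):
            return [leaf for child in item for leaf in flatten(child)]
        return [item]
    return [flatten(sample) for sample in batch]


shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
flat = flatten_keep_batch([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
```

The shuffle interleaves channels from different groups while the number of channels, and therefore the tensor shape, stays the same.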
Mathematical Operations
API Name |
Description |
Supported Platforms |
Calculate the mean and variance of the input x along the specified axis. |
|
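Computing the mean and variance along an axis can be sketched for a 2D nested list in plain Python. The helper name `moments` is hypothetical, and the use of the population variance (dividing by the number of elements) is an assumption, not something stated on this page.

```python
def moments(rows, axis):
    """Mean and population variance of a 2D nested list along axis 0 or 1."""
    if axis == 0:
        # Reduce over rows: regroup the values column by column.
        groups = [list(column) for column in zip(*rows)]
    elif axis == 1:
        groups = rows
    else:
        raise ValueError("axis must be 0 or 1 for a 2D input")
    means = [sum(g) / len(g) for g in groups]
    variances = [
        sum((v - m) ** 2 for v in g) / len(g)
        for g, m in zip(groups, means)
    ]
    return means, variances


x = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
mean, variance = moments(x, axis=0)
```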