mindspore.nn
Neural Network Cell
Predefined building blocks and computational units for constructing neural networks.
For the mindspore.nn operators added or deleted since the previous version, and for changes to their supported platforms, refer to API Updates.
Basic Block
API Name |
Description |
Supported Platforms |
The basic building block of neural networks in MindSpore. |
|
|
Base class for running the graph loaded from a MindIR. |
|
|
Base class for other losses. |
|
|
Base class for updating parameters. |
|
Container
API Name |
Description |
Supported Platforms |
Holds Cells in a list. |
|
|
Sequential Cell container. |
|
Wrapper Layer
API Name |
Description |
Supported Platforms |
A distributed optimizer. |
|
|
Dynamic Loss scale update cell. |
|
|
Update cell with fixed loss scaling value. |
|
|
Encapsulate training network. |
|
|
Cell to run for getting the next operation. |
|
|
Wrap the network with Batch Size. |
|
|
Cell that updates parameter. |
|
|
Wrap the network with Micro Batch. |
|
|
The time distributed layer. |
|
|
Network training package class. |
|
|
Network training with loss scaling. |
|
|
Wraps the forward network with the loss function. |
|
|
Cell that returns the gradients. |
|
|
Cell with loss function. |
|
Convolutional Neural Network Layer
API Name |
Description |
Supported Platforms |
1D convolution layer. |
|
|
1D transposed convolution layer. |
|
|
2D convolution layer. |
|
|
2D transposed convolution layer. |
|
|
3D convolution layer. |
|
|
3D transposed convolution layer. |
|
|
Extracts patches from images. |
|
Recurrent Neural Network Layer
API Name |
Description |
Supported Platforms |
Stacked Elman RNN layers. |
|
|
An Elman RNN cell with tanh or ReLU non-linearity. |
|
|
Stacked GRU (Gated Recurrent Unit) layers. |
|
|
A GRU (Gated Recurrent Unit) cell. |
|
|
Stacked LSTM (Long Short-Term Memory) layers. |
|
|
An LSTM (Long Short-Term Memory) cell. |
|
Embedding Layer
API Name |
Description |
Supported Platforms |
A simple lookup table that stores embeddings of a fixed dictionary and size. |
|
|
EmbeddingLookup layer. |
|
|
Returns a slice of input tensor based on the specified indices and the field ids. |
|
Nonlinear Activation Function Layer
API Name |
Description |
Supported Platforms |
Continuously differentiable exponential linear units activation function. |
|
|
Exponential Linear Unit activation function. |
|
|
Fast Gaussian error linear unit activation function. |
|
|
Gaussian error linear unit activation function. |
|
|
Gets the activation function. |
|
|
Hardtanh activation function. |
|
|
Hard Shrink activation function. |
|
|
Hard sigmoid activation function. |
|
|
Hard swish activation function. |
|
|
Leaky ReLU activation function. |
|
|
Logsigmoid activation function. |
|
|
LogSoftmax activation function. |
|
|
Local Response Normalization. |
|
|
Computes MISH (A Self Regularized Non-Monotonic Neural Activation Function) of input tensors element-wise. |
|
|
Softsign activation function. |
|
|
PReLU activation function. |
|
|
Rectified Linear Unit activation function. |
|
|
Computes the ReLU6 activation function. |
|
|
Randomized Leaky ReLU activation function. |
|
|
SELU (Scaled Exponential Linear Unit) activation function. |
|
|
Sigmoid Linear Unit activation function. |
|
|
Sigmoid activation function. |
|
|
Softmin activation function, a multi-class extension of the two-category sigmoid function that is equivalent to softmax applied to the negated input. |
|
|
Softmax activation function, which extends the two-category sigmoid function to multi-class classification. |
|
|
Applies the SoftShrink function element-wise. |
|
|
Tanh activation function. |
|
|
Tanhshrink activation function. |
|
|
Thresholds each element of the input Tensor. |
|
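The relationship between the sigmoid and softmax entries in the table above can be checked numerically. The following is a plain-Python sketch, not MindSpore code; the helper names `sigmoid` and `softmax` are defined here for illustration only. It shows that a two-class softmax over the logits (x, 0) yields the same probability as sigmoid(x), and that softmax outputs over more classes still sum to 1.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    # Subtract the maximum first so exp() cannot overflow.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


x = 1.3
two_class = softmax([x, 0.0])           # probabilities for the logits (x, 0)
three_class = softmax([2.0, 1.0, 0.1])  # a multi-class case

print(two_class[0], sigmoid(x))
print(three_class, sum(three_class))
```

Subtracting the maximum before exponentiating does not change the result; it only keeps the intermediate values in a safe numeric range.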
Linear Layer
API Name |
Description |
Supported Platforms |
The densely connected layer. |
|
|
The bilinear densely connected layer. |
|
Dropout Layer
API Name |
Description |
Supported Platforms |
Dropout layer for the input. |
|
|
During training, randomly zeroes some channels of the input tensor with probability p, sampled from a Bernoulli distribution. For a 4-dimensional tensor with shape \(NCHW\), a channel feature map is the 2-dimensional feature map with shape \(HW\). |
|
|
During training, randomly zeroes some channels of the input tensor with probability p, sampled from a Bernoulli distribution. For a 5-dimensional tensor with shape \(NCDHW\), a channel feature map is the 3-dimensional feature map with shape \(DHW\). |
|
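The difference between element-wise dropout and the channel-wise variants described above is that a whole feature map is either kept or zeroed. The sketch below illustrates this for a single sample in plain Python. The function `channel_dropout_sketch` is a hypothetical helper, not a MindSpore API, and it deliberately leaves out any rescaling of the surviving channels, since the descriptions above do not state whether one is applied.

```python
import random


def channel_dropout_sketch(feature_maps, p, rng):
    """Zero whole channels of one sample with probability p (hypothetical helper).

    feature_maps is a list of channels; each channel is an H x W list of lists.
    """
    output = []
    for channel in feature_maps:
        if rng.random() < p:
            # Drop the entire H x W feature map, not individual elements.
            output.append([[0.0 for _ in row] for row in channel])
        else:
            output.append([list(row) for row in channel])
    return output


rng = random.Random(0)
sample = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(8)]  # 8 channels of shape 2 x 2
dropped = channel_dropout_sketch(sample, p=0.5, rng=rng)

for index, channel in enumerate(dropped):
    print(index, channel)
```

Each printed channel is either all zeros or identical to the input channel, which is the behavior the table describes for training mode.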
Normalization Layer
API Name |
Description |
Supported Platforms |
Batch Normalization layer over a 2D input. |
|
|
Batch Normalization layer over a 4D input. |
|
|
Batch Normalization layer over a 5D input. |
|
|
Group Normalization over a mini-batch of inputs. |
|
|
Instance Normalization layer over a 3D input. |
|
|
Instance Normalization layer over a 4D input. |
|
|
Instance Normalization layer over a 5D input. |
|
|
Applies Layer Normalization over a mini-batch of inputs. |
|
|
Sync Batch Normalization layer over an N-dimensional input. |
|
Pooling Layer
API Name |
Description |
Supported Platforms |
1D adaptive average pooling for temporal data. |
|
|
2D adaptive average pooling for temporal data. |
|
|
3D adaptive average pooling for temporal data. |
|
|
1D adaptive maximum pooling for temporal data. |
|
|
AdaptiveMaxPool2d operation. |
|
|
1D average pooling for temporal data. |
|
|
2D average pooling for temporal data. |
|
|
1D max pooling operation for temporal data. |
|
|
2D max pooling operation for temporal data. |
|
Padding Layer
API Name |
Description |
Supported Platforms |
Pads the input tensor according to the paddings and mode. |
|
|
Pads the last dimension of the input tensor with a given constant value. |
|
|
Pads the last two dimensions of the input tensor with a given constant value. |
|
|
Pads the last three dimensions of the input tensor with a given constant value. |
|
|
Applies reflection padding to the given tensor using the given padding. |
|
|
Applies reflection padding to the given tensor using the given padding. |
|
|
Pads the last two dimensions of input tensor with zero. |
|
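Reflection padding and constant padding differ in where the new values come from. The sketch below shows the one-dimensional case on a plain Python list. The helper `reflection_pad_1d_sketch` is hypothetical and is not part of MindSpore; it assumes the common convention that the edge element acts as the mirror axis and is not repeated.

```python
def reflection_pad_1d_sketch(values, pad_left, pad_right):
    """Reflect a list around its first and last elements (hypothetical helper)."""
    if pad_left >= len(values) or pad_right >= len(values):
        raise ValueError("each padding must be smaller than the input length")
    # The edge element itself is the mirror axis, so it is not repeated.
    left = [values[i] for i in range(pad_left, 0, -1)]
    right = [values[-2 - i] for i in range(pad_right)]
    return left + list(values) + right


padded = reflection_pad_1d_sketch([1, 2, 3, 4], pad_left=2, pad_right=1)
# Constant padding with value 0 and the same widths, for contrast.
zero_padded = [0] * 2 + [1, 2, 3, 4] + [0] * 1

print(padded)       # [3, 2, 1, 2, 3, 4, 3]
print(zero_padded)  # [0, 0, 1, 2, 3, 4, 0]
```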
Loss Function
API Name |
Description |
Supported Platforms |
BCELoss creates a criterion to measure the binary cross entropy between the true labels and predicted labels. |
|
|
Adds sigmoid activation function to input logits, and uses the given logits to compute binary cross entropy between the logits and the labels. |
|
|
CosineEmbeddingLoss creates a criterion to measure the similarity between two tensors using cosine distance. |
|
|
The cross entropy loss between input and target. |
|
|
The Dice coefficient is a set similarity loss, which is used to calculate the similarity between two samples. |
|
|
A loss function that addresses class imbalance and differences in classification difficulty. |
|
|
HuberLoss calculates the error between the predicted value and the target value. |
|
|
L1Loss is used to calculate the mean absolute error between the predicted value and the target value. |
|
|
Calculates the mean squared error between the predicted value and the label value. |
|
|
For multi-class problems, the label is transformed into multiple binary classifications by one-hot encoding. |
|
|
Gets the negative log likelihood loss between logits and labels. |
|
|
RMSELoss creates a criterion to measure the root mean square error between \(x\) and \(y\) element-wise, where \(x\) is the input and \(y\) is the labels. |
|
|
Computes the sampled softmax training loss. |
|
|
SmoothL1 loss function: if the element-wise absolute error between the predicted value and the target value is less than the threshold beta, the squared term is used; otherwise, the absolute error term is used. |
|
|
A loss class for two-class classification problems. |
|
|
Computes softmax cross entropy between logits and labels. |
|
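The piecewise rule in the SmoothL1 description above can be written out directly. The following is a plain-Python sketch, not MindSpore code; `smooth_l1_sketch` is a hypothetical helper, and the constants 0.5 and the division by beta follow the usual SmoothL1 definition, which is assumed here to match the library.

```python
def smooth_l1_sketch(prediction, target, beta=1.0):
    """Element-wise SmoothL1 on two equal-length lists (hypothetical helper)."""
    losses = []
    for p, t in zip(prediction, target):
        error = abs(p - t)
        if error < beta:
            # Small errors: squared term, scaled so both branches meet at error == beta.
            losses.append(0.5 * error * error / beta)
        else:
            # Large errors: absolute error term.
            losses.append(error - 0.5 * beta)
    return losses


losses = smooth_l1_sketch([1.0, 0.0], [1.5, 3.0], beta=1.0)
print(losses)  # [0.125, 2.5]
```

The first pair has an absolute error of 0.5, below beta, so the squared branch applies; the second has an error of 3.0, so the absolute branch applies.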
Optimizer
API Name |
Description |
Supported Platforms |
Implements the Adadelta algorithm. |
|
|
Implements the Adagrad algorithm. |
|
|
Implements the Adaptive Moment Estimation (Adam) algorithm. |
|
|
Implements the AdaMax algorithm, a variant of Adaptive Moment Estimation (Adam) based on the infinity norm. |
|
|
This optimizer will offload Adam optimizer to host CPU and keep parameters being updated on the device, to minimize the memory cost. |
|
|
Implements the Adam algorithm with weight decay. |
|
|
Enable the adasum in "auto_parallel/semi_auto_parallel" mode. |
|
|
Enable the adasum in "auto_parallel/semi_auto_parallel" mode. |
|
|
Implements Average Stochastic Gradient Descent. |
|
|
Implements the FTRL algorithm. |
|
|
Implements the LAMB (Layer-wise Adaptive Moments optimizer for Batch training) algorithm. |
|
|
Implements the LARS algorithm. |
|
|
Implements the Adaptive Moment Estimation (Adam) algorithm. |
|
|
Implements the Momentum algorithm. |
|
|
Implements the ProximalAdagrad algorithm. |
|
|
Implements Root Mean Squared Propagation (RMSProp) algorithm. |
|
|
Implements Resilient backpropagation. |
|
|
Implements stochastic gradient descent. |
|
|
Updates gradients using the second-order algorithm THOR. |
|
Evaluation Metrics
API Name |
Description |
Supported Platforms |
Calculates the accuracy for classification and multilabel data. |
|
|
Computes the AUC (Area Under the Curve) using the trapezoidal rule. |
|
|
Calculates the BLEU score. |
|
|
Computes the confusion matrix, which is commonly used to evaluate the performance of classification models, including binary classification and multiple classification. |
|
|
Computes metrics related to confusion matrix. |
|
|
Computes representation similarity. |
|
|
The Dice coefficient is a set similarity metric. |
|
|
Calculates the F1 score. |
|
|
Calculates the Fbeta score. |
|
|
Calculates the Hausdorff distance. |
|
|
Gets the metric method based on the input name. |
|
|
Calculates the average of the loss. |
|
|
Calculates the mean absolute error (MAE). |
|
|
Computes the Average Surface Distance from y_pred to y under the default setting. |
|
|
Base class of metric, which is used to evaluate metrics. |
|
|
Measures the mean squared error (MSE). |
|
|
Gets all names of the metric methods. |
|
|
Calculates the occlusion sensitivity of the model for a given image, which illustrates which parts of an image are most important for a network's classification. |
|
|
Computes perplexity. |
|
|
Calculates precision for classification and multilabel data. |
|
|
Calculates recall for classification and multilabel data. |
|
|
Calculates the ROC curve. |
|
|
Computes the Root Mean Square Surface Distance from y_pred to y under the default setting. |
|
|
This decorator rearranges the inputs according to the indexes attribute of the class. |
|
|
Calculates the top-1 categorical accuracy. |
|
|
Calculates the top-5 categorical accuracy. |
|
|
Calculates the top-k categorical accuracy. |
|
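The top-1, top-5 and top-k accuracy entries above all follow the same rule: a sample counts as correct when its true label is among the k highest-scoring classes. The sketch below shows that rule in plain Python. The helper `top_k_accuracy_sketch` is hypothetical and is not the MindSpore metric class, and the scores and labels are made-up illustration data.

```python
def top_k_accuracy_sketch(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    correct = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            correct += 1
    return correct / len(labels)


# Illustration data: 3 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]
top1 = top_k_accuracy_sketch(scores, labels, k=1)
top2 = top_k_accuracy_sketch(scores, labels, k=2)
print(top1, top2)
```

Raising k can only keep or increase the accuracy, because the top-k set for a larger k contains the set for a smaller k.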
Dynamic Learning Rate
LearningRateSchedule Class
The dynamic learning rates in this module are all subclasses of LearningRateSchedule. Pass an instance of LearningRateSchedule to an optimizer. During training, the optimizer calls the instance with the current step as input to get the current learning rate.
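import mindspore.nn as nn

# Net is assumed to be a user-defined nn.Cell subclass; it is not defined in this snippet.
min_lr = 0.01
max_lr = 0.1
decay_steps = 4
# Build the schedule object and hand it to the optimizer as its learning rate.
cosine_decay_lr = nn.CosineDecayLR(min_lr, max_lr, decay_steps)
net = Net()
optim = nn.Momentum(net.trainable_params(), learning_rate=cosine_decay_lr, momentum=0.9)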
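To make the calling convention concrete, the sketch below mimics what an optimizer does with a schedule instance: it calls the object with the current step and uses the returned value. This is a plain-Python stand-in, not MindSpore code. The class name `CosineDecaySketch` is hypothetical, and the formula is the standard cosine decay, assumed here to match nn.CosineDecayLR.

```python
import math


class CosineDecaySketch:
    """Plain-Python stand-in for a learning-rate schedule object (hypothetical name)."""

    def __init__(self, min_lr, max_lr, decay_steps):
        self.min_lr = min_lr
        self.max_lr = max_lr
        self.decay_steps = decay_steps

    def __call__(self, current_step):
        # Clamp the step so the rate stays at min_lr once decay_steps is reached.
        step = min(current_step, self.decay_steps)
        cosine = 1.0 + math.cos(math.pi * step / self.decay_steps)
        return self.min_lr + 0.5 * (self.max_lr - self.min_lr) * cosine


schedule = CosineDecaySketch(min_lr=0.01, max_lr=0.1, decay_steps=4)
# An optimizer would make one such call per training step.
rates = [schedule(step) for step in range(6)]
print(rates)
```

With these values the rate starts at max_lr, passes the midpoint 0.055 at step 2, and stays at min_lr from step 4 onward.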
API Name |
Description |
Supported Platforms |
Calculates learning rate based on cosine decay function. |
|
|
Calculates learning rate based on exponential decay function. |
|
|
Calculates learning rate based on inverse-time decay function. |
|
|
Calculates learning rate based on natural exponential decay function. |
|
|
Calculates learning rate based on polynomial decay function. |
|
|
Gets learning rate warming up. |
|
Dynamic LR Function
The dynamic learning rates in this module are all functions. Call the function and pass the result to an optimizer. During training, the optimizer takes result[current step] as the current learning rate.
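import mindspore.nn as nn

# Net is assumed to be a user-defined nn.Cell subclass; it is not defined in this snippet.
min_lr = 0.01
max_lr = 0.1
total_step = 6
step_per_epoch = 1
decay_epoch = 4
# The function returns one learning rate per step; pass the result to the optimizer.
lr = nn.cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch)
net = Net()
optim = nn.Momentum(net.trainable_params(), learning_rate=lr, momentum=0.9)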
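The function-style interface differs from the class-style one in that all rates are computed up front and then looked up by step index. The sketch below reproduces that idea in plain Python with the same parameter values. The helper `cosine_decay_lr_sketch` is hypothetical, and its per-epoch cosine formula is an assumption about how nn.cosine_decay_lr behaves, not a copy of its implementation.

```python
import math


def cosine_decay_lr_sketch(min_lr, max_lr, total_step, step_per_epoch, decay_epoch):
    """Return one learning rate per step as a plain list (hypothetical helper)."""
    rates = []
    for step in range(total_step):
        # The rate changes once per epoch and stops decaying after decay_epoch.
        epoch = min(step // step_per_epoch, decay_epoch)
        cosine = 1.0 + math.cos(math.pi * epoch / decay_epoch)
        rates.append(min_lr + 0.5 * (max_lr - min_lr) * cosine)
    return rates


lr = cosine_decay_lr_sketch(0.01, 0.1, total_step=6, step_per_epoch=1, decay_epoch=4)
current_step = 2
current_lr = lr[current_step]  # the lookup described above: result[current step]
print(lr)
print(current_lr)
```

Because the list has exactly total_step entries, total_step should cover the whole training run so that every step has a rate to look up.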
API Name |
Description |
Supported Platforms |
Calculates learning rate based on cosine decay function. |
|
|
Calculates learning rate based on exponential decay function. |
|
|
Calculates learning rate based on inverse-time decay function. |
|
|
Calculates learning rate based on natural exponential decay function. |
|
|
Gets piecewise constant learning rate. |
|
|
Calculates learning rate based on polynomial decay function. |
|
|
Gets learning rate warming up. |
|
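Of the functions in the table above, the piecewise constant schedule is the simplest to write out by hand. The sketch below expands a list of milestones and a list of rates into one rate per step. The helper `piecewise_constant_lr_sketch` is hypothetical and is not the MindSpore function; it assumes that milestones are increasing step indices and that each rate applies up to, but not including, its milestone.

```python
def piecewise_constant_lr_sketch(milestones, learning_rates):
    """Expand (milestone, rate) pairs into one rate per step (hypothetical helper)."""
    if len(milestones) != len(learning_rates):
        raise ValueError("milestones and learning_rates must have the same length")
    rates = []
    previous = 0
    for milestone, rate in zip(milestones, learning_rates):
        # Steps in [previous, milestone) all use the same rate.
        rates.extend([rate] * (milestone - previous))
        previous = milestone
    return rates


lr = piecewise_constant_lr_sketch([2, 5, 10], [0.1, 0.05, 0.01])
print(lr)
```

The resulting list has as many entries as the last milestone, so it can be indexed by the current step in the same way as the other function-style schedules.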
Image Processing Layer
API Name |
Description |
Supported Platforms |
Crops the central region of the images with the central_fraction. |
|
|
Returns two tensors, the first is along the height dimension and the second is along the width dimension. |
|
|
Returns MS-SSIM index between two images. |
|
|
Returns Peak Signal-to-Noise Ratio of two image batches. |
|
|
Samples the input tensor to the given size or scale_factor using bilinear interpolation. |
|
|
Returns SSIM index between two images. |
|
Matrix Processing
API Name |
Description |
Supported Platforms |
Returns a batched diagonal tensor with the given batched diagonal values. |
|
|
Returns the batched diagonal part of a batched tensor. |
|
|
Modifies the batched diagonal part of a batched tensor. |
|
Tools
API Name |
Description |
Supported Platforms |
Clips tensor values to a maximum \(L_2\)-norm. |
|
|
Flattens the dimensions other than the 0th dimension of the input Tensor. |
|
|
Applies L1 regularization to weights. |
|
|
Computes the norm of vectors, currently including Euclidean norm, i.e., \(L_2\)-norm. |
|
|
Returns a one-hot tensor. |
|
|
Creates a sequence of numbers in range [start, limit) with step size delta. |
|
|
Rolls the elements of a tensor along an axis. |
|
|
Returns a tensor with the elements above the specified diagonal set to zero. |
|
|
Returns a tensor with elements below the kth diagonal zeroed. |
|
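Clipping to a maximum \(L_2\)-norm, the first entry in the table above, rescales a tensor only when its norm exceeds the limit. The sketch below shows the rule on a flat Python list. The helper `clip_by_l2_norm_sketch` is hypothetical and is not the MindSpore cell; it assumes the usual behavior of leaving the input unchanged when its norm is already within the limit.

```python
import math


def clip_by_l2_norm_sketch(values, clip_norm):
    """Rescale a vector so its L2-norm does not exceed clip_norm (hypothetical helper)."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= clip_norm:
        # Already within the limit: return an unchanged copy.
        return list(values)
    scale = clip_norm / norm
    return [v * scale for v in values]


clipped = clip_by_l2_norm_sketch([3.0, 4.0], clip_norm=1.0)    # norm 5 is scaled down to 1
untouched = clip_by_l2_norm_sketch([0.3, 0.4], clip_norm=1.0)  # norm 0.5 is left as-is
print(clipped)
print(untouched)
```

Rescaling preserves the direction of the vector, which is why this form of clipping is often preferred over clipping each element independently.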
Mathematical Operations
API Name |
Description |
Supported Platforms |
Calculate the mean and variance of the input x along the specified axis. |
|
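What "along the specified axis" means is easiest to see on a small two-dimensional input. The sketch below computes the mean and variance of a nested Python list along either axis. The helper `moments_sketch` is hypothetical and is not the MindSpore cell, and it assumes the population variance (mean of squared deviations), which may differ from the library's convention.

```python
def moments_sketch(rows, axis):
    """Mean and variance of a 2-D list along axis 0 (columns) or 1 (rows)."""
    if axis == 0:
        # Transpose so each inner list holds the values to be reduced together.
        groups = [list(column) for column in zip(*rows)]
    elif axis == 1:
        groups = [list(row) for row in rows]
    else:
        raise ValueError("axis must be 0 or 1 in this sketch")
    means = [sum(g) / len(g) for g in groups]
    # Population variance: mean of squared deviations (assumed convention).
    variances = [sum((v - m) ** 2 for v in g) / len(g) for g, m in zip(groups, means)]
    return means, variances


x = [[1.0, 2.0], [3.0, 4.0]]
mean0, var0 = moments_sketch(x, axis=0)  # reduce over rows, one result per column
mean1, var1 = moments_sketch(x, axis=1)  # reduce over columns, one result per row
print(mean0, var0)
print(mean1, var1)
```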