mindspore.nn
Neural network cells.
Pre-defined building blocks or computing units to construct neural networks.
For the operators added or removed in mindspore.nn and the changes to their supported platforms compared with the previous version, please refer to https://gitee.com/mindspore/docs/blob/r1.5/resource/api_updates/nn_api_updates.md.
Cell
- Base class for all neural networks.
- Base class for running the graph loaded from a MindIR.
- Base class for GraphKernel.
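For example, a custom network is defined by subclassing the Cell base class and implementing its computation in the construct method. The sketch below is illustrative; the MyNet name and layer sizes are arbitrary.
import mindspore.nn as nn
class MyNet(nn.Cell):  # illustrative network name
    def __init__(self):
        super(MyNet, self).__init__()
        self.dense = nn.Dense(16, 8)
        self.relu = nn.ReLU()
    def construct(self, x):  # forward computation of a Cell
        return self.relu(self.dense(x))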
Containers
- Holds Cells in a list.
- Sequential cell container.
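For example, the sequential container chains cells and runs them in order; the layer sizes below are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
seq = nn.SequentialCell([nn.Dense(16, 32), nn.ReLU(), nn.Dense(32, 10)])  # cells run in list order
x = ms.Tensor(np.ones([4, 16]), ms.float32)
out = seq(x)  # shape (4, 10)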
Convolution Layers
- 1D convolution layer.
- 1D transposed convolution layer.
- 2D convolution layer.
- 2D transposed convolution layer.
- 3D convolution layer.
- Computes a 3D transposed convolution, which is also known as a deconvolution (although it is not an actual deconvolution).
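A minimal sketch of the convolution layers above using nn.Conv2d, which maps a 3-channel NCHW input to 16 output channels; the shapes and parameter values are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
conv = nn.Conv2d(3, 16, kernel_size=3, stride=1, pad_mode='same')  # 3x3 kernel, output keeps spatial size
x = ms.Tensor(np.ones([1, 3, 32, 32]), ms.float32)  # NCHW input
y = conv(x)  # shape (1, 16, 32, 32)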
Gradient
- Computes the Jacobian-vector product of the given network. (Supported platforms: to be developed.)
- Computes the dot product between a vector v and the Jacobian of the given network at the point given by the inputs. (Supported platforms: to be developed.)
Recurrent Layers
- A GRU (Gated Recurrent Unit) cell.
- Stacked GRU (Gated Recurrent Unit) layers.
- LSTM (Long Short-Term Memory) layer.
- Stacked LSTM (Long Short-Term Memory) layers.
- An Elman RNN cell with tanh or ReLU non-linearity.
- Stacked Elman RNN layers.
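A minimal sketch of the stacked LSTM layers above using nn.LSTM; the batch size, sequence length and hidden size are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
lstm = nn.LSTM(input_size=10, hidden_size=16, num_layers=2, batch_first=True)
x = ms.Tensor(np.ones([3, 5, 10]), ms.float32)  # (batch, seq_len, feature)
h0 = ms.Tensor(np.zeros([2, 3, 16]), ms.float32)  # (num_layers, batch, hidden_size)
c0 = ms.Tensor(np.zeros([2, 3, 16]), ms.float32)
output, (hn, cn) = lstm(x, (h0, c0))  # output shape (3, 5, 16)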
Sparse Layers
- A simple lookup table that stores embeddings of a fixed dictionary and size.
- Returns a slice of the input tensor based on the specified indices.
- Returns a slice of the input tensor based on the specified indices and field IDs.
- Converts a sparse tensor into a dense tensor.
- Multiplies sparse matrix a and dense matrix b.
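A minimal sketch of the embedding lookup table above using nn.Embedding; the vocabulary size and embedding size are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
embedding = nn.Embedding(vocab_size=1000, embedding_size=32)
ids = ms.Tensor(np.array([[1, 3, 5], [2, 4, 6]]), ms.int32)  # token indices
vectors = embedding(ids)  # shape (2, 3, 32)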
Non-linear Activations
- Exponential Linear Unit activation function.
- Fast Gaussian error linear unit activation function.
- Gaussian error linear unit activation function.
- Gets the activation function.
- Applies the hard shrinkage function element-wise.
- Hard sigmoid activation function.
- Hard swish activation function.
- Leaky ReLU activation function.
- LogSigmoid activation function.
- LogSoftmax activation function.
- PReLU activation function.
- Rectified Linear Unit activation function.
- Computes the ReLU6 activation function.
- Sigmoid activation function.
- Softmax activation function.
- Applies the soft shrinkage function element-wise.
- Tanh activation function.
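Activation cells are applied element-wise to a tensor; a minimal sketch with nn.ReLU and nn.Softmax on an illustrative input follows.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
x = ms.Tensor(np.array([-1.0, 0.0, 2.0]), ms.float32)
relu = nn.ReLU()
softmax = nn.Softmax()
print(relu(x))  # [0. 0. 2.]
print(softmax(x))  # probabilities that sum to 1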
Utilities
- Clips tensor values to a maximum L2-norm.
- The fully connected (dense) layer.
- Dropout layer for the input.
- Flatten layer for the input.
- Applies L1 regularization to weights.
- Computes the norm of vectors, currently including the Euclidean norm, i.e., the L2-norm.
- Returns a one-hot tensor.
- Pads the input tensor according to the paddings and mode.
- Creates a sequence of numbers in the range [start, limit) with step size delta.
- Samples the input tensor to the given size or scale_factor using bilinear interpolation.
- Rolls the elements of a tensor along an axis.
- Returns a tensor with elements above the kth diagonal zeroed.
- Returns a tensor with elements below the kth diagonal zeroed.
- Extracts patches from images.
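A minimal sketch combining the flatten, dense and dropout layers above; the shapes and keep_prob value are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
flatten = nn.Flatten()
dense = nn.Dense(3 * 4 * 4, 10)
dropout = nn.Dropout(keep_prob=0.8)  # randomly zeroes activations during training
x = ms.Tensor(np.ones([2, 3, 4, 4]), ms.float32)
y = dropout(dense(flatten(x)))  # shape (2, 10)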
Images Functions
- Crops the central region of the images with the central_fraction.
- Returns two tensors: the first along the height dimension and the second along the width dimension.
- Returns the MS-SSIM index between two images.
- Returns the Peak Signal-to-Noise Ratio (PSNR) of two image batches.
- Returns the SSIM index between two images.
Normalization Layers
- Batch Normalization layer over a 2D input.
- Batch Normalization layer over a 4D input.
- Batch Normalization layer over a 5D input.
- Global Batch Normalization layer over an N-dimension input.
- Group Normalization over a mini-batch of inputs.
- Instance Normalization layer over a 4D input.
- Applies Layer Normalization over a mini-batch of inputs.
- Returns a batched diagonal tensor with given batched diagonal values.
- Returns the batched diagonal part of a batched tensor.
- Modifies the batched diagonal part of a batched tensor.
- Sync Batch Normalization layer over an N-dimension input.
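A minimal sketch of Batch Normalization over a 4D (NCHW) input using nn.BatchNorm2d; the channel count and input shape are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
bn = nn.BatchNorm2d(num_features=16)  # one scale/shift pair per channel
x = ms.Tensor(np.ones([8, 16, 32, 32]), ms.float32)
y = bn(x)  # same shape as x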
Pooling Layers
- 1D average pooling for temporal data.
- 2D average pooling for spatial data.
- 1D max pooling operation for temporal data.
- 2D max pooling operation for spatial data.
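A minimal sketch of 2D max pooling using nn.MaxPool2d; the kernel size, stride and input shape are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
pool = nn.MaxPool2d(kernel_size=2, stride=2)  # halves the spatial dimensions
x = ms.Tensor(np.ones([1, 3, 32, 32]), ms.float32)
y = pool(x)  # shape (1, 3, 16, 16)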
Quantized Functions
- Quantization aware training activation function.
- A combination of convolution, batch normalization, and activation layers.
- 2D convolution with the Batch Normalization operation folded into the construct.
- 2D convolution that uses the convolution layer statistics once to calculate the folded Batch Normalization construct.
- 2D convolution and batch normalization without folding, with a fake-quantized construct.
- 2D convolution with a fake-quantized operation layer.
- A combination of Dense, batch normalization, and activation layers.
- The fully connected layer with a fake-quantized operation.
- Quantization-aware operation that provides the fake-quantization observer function on data with min and max.
- Adds a fake-quantized operation after the Mul operation.
- Adds a fake-quantized operation after the TensorAdd operation.
Loss Functions
- BCELoss creates a criterion to measure the binary cross entropy between the true labels and predicted labels.
- Adds the sigmoid activation function to the input logits, and uses the given logits to compute the binary cross entropy between the logits and the labels.
- CosineEmbeddingLoss creates a criterion to measure the similarity between two tensors using cosine distance.
- The Dice coefficient is a set similarity loss.
- The loss function proposed by Kaiming He's team in the paper Focal Loss for Dense Object Detection.
- L1Loss creates a criterion to measure the mean absolute error (MAE) between x and y element-wise, where x is the input tensor and y is the labels tensor.
- Base class for other losses.
- MAELoss creates a criterion to measure the average absolute error between x and y element-wise, where x is the input and y is the labels.
- MSELoss creates a criterion to measure the mean squared error (squared L2-norm) between x and y element-wise, where x is the input and y is the labels.
- For multiclass classification, the label is transformed into multiple binary classifications by one-hot encoding.
- RMSELoss creates a criterion to measure the root mean square error between x and y element-wise, where x is the input and y is the labels.
- Computes the sampled softmax training loss.
- A loss class for learning region proposals.
- A loss class for two-class classification problems.
- Computes softmax cross entropy between logits and labels.
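A minimal sketch of the softmax cross-entropy loss above using nn.SoftmaxCrossEntropyWithLogits with sparse integer labels; the batch size and class count are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
logits = ms.Tensor(np.random.randn(4, 10).astype(np.float32))
labels = ms.Tensor(np.array([1, 0, 7, 3]), ms.int32)  # class indices
loss = loss_fn(logits, labels)  # scalar mean cross-entropy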
Optimizer Functions
- Implements the Adagrad algorithm with the ApplyAdagrad operator.
- Updates gradients by the Adaptive Moment Estimation (Adam) algorithm.
- This optimizer offloads the Adam computation to the host CPU and keeps the parameters updated on the device, to minimize memory cost.
- Implements the Adam algorithm with weight decay.
- Implements the FTRL algorithm with the ApplyFtrl operator.
- Lamb (Layer-wise Adaptive Moments optimizer for Batch training) with dynamic learning rate.
- Implements the LARS algorithm with the LARSUpdate operator.
- This optimizer applies a lazy Adam algorithm when the gradient is sparse.
- Implements the Momentum algorithm.
- Base class for all optimizers.
- Implements the ProximalAdagrad algorithm with the ApplyProximalAdagrad operator.
- Implements the Root Mean Squared Propagation (RMSProp) algorithm.
- Implements stochastic gradient descent.
- Updates gradients by the second-order algorithm THOR.
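A minimal sketch of constructing the optimizers above; a single dense layer stands in for a full network, and the hyperparameter values are illustrative.
import mindspore.nn as nn
net = nn.Dense(16, 10)  # placeholder network
optim = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
adam = nn.Adam(net.trainable_params(), learning_rate=1e-3)  # adaptive alternative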
Wrapper Functions
- A distributed optimizer.
- Dynamic loss scale update cell.
- Static loss scale update cell; the loss scaling value will not be updated.
- Network training package class.
- Cell to run for getting the next operation.
- Cell that updates a parameter.
- Wraps the network with micro batch. (Supported platforms: to be developed.)
- The time distributed layer.
- Network training package class.
- Network training with loss scaling.
- Cell that returns loss, output and label for evaluation.
- Cell that returns the gradients.
- Cell with a loss function.
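A minimal sketch of the training wrappers above: nn.WithLossCell attaches a loss function to a backbone, and nn.TrainOneStepCell runs one forward/backward/update step; the backbone, shapes and hyperparameters are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
net = nn.Dense(16, 10)  # placeholder backbone
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
optim = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.9)
net_with_loss = nn.WithLossCell(net, loss_fn)
train_step = nn.TrainOneStepCell(net_with_loss, optim)
data = ms.Tensor(np.ones([4, 16]), ms.float32)
label = ms.Tensor(np.array([1, 0, 7, 3]), ms.int32)
loss = train_step(data, label)  # one training step, returns the loss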
Math Functions
- Multiplies matrix x1 by matrix x2.
- Calculates the mean and variance of x.
- Reduces a dimension of a tensor by calculating the exponential of all elements in the dimension and then calculating the logarithm of the sum.
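A minimal sketch of the matrix multiplication cell above, assuming it corresponds to nn.MatMul in this release; the matrix shapes are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
matmul = nn.MatMul()  # assumed name of the matrix multiplication cell
x1 = ms.Tensor(np.ones([2, 3]), ms.float32)
x2 = ms.Tensor(np.ones([3, 4]), ms.float32)
out = matmul(x1, x2)  # shape (2, 4)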
Metrics
- Calculates the accuracy for classification and multilabel data.
- Computes the AUC (Area Under the Curve) using the trapezoidal rule.
- Calculates the BLEU score of machine-translated text with one or more references.
- Computes the confusion matrix.
- Computes confusion-matrix-related metrics for classification models whose output is binary or multiclass.
- Computes representation similarity.
- The Dice coefficient is a set similarity metric.
- Calculates the F1 score.
- Calculates the Fbeta score.
- Calculates the Hausdorff distance.
- Gets the metric method based on the input name.
- Calculates the average of the loss.
- Calculates the mean absolute error (MAE).
- This function is used to compute the Average Surface Distance from y_pred to y under the default setting.
- Base class of metrics.
- Measures the mean squared error (MSE).
- Gets the names of the metric methods.
- This function is used to calculate the occlusion sensitivity of the model for a given image.
- Computes perplexity.
- Calculates precision for classification and multilabel data.
- Calculates recall for classification and multilabel data.
- Calculates the ROC curve.
- This function is used to compute the Residual Mean Square Distance from y_pred to y under the default setting.
- This decorator is used to rearrange the inputs according to its _indexes attribute, which is specified by the set_indexes method.
- Calculates the top-1 categorical accuracy.
- Calculates the top-5 categorical accuracy.
- Calculates the top-k categorical accuracy.
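A minimal sketch of the classification accuracy metric above using nn.Accuracy: clear the internal state, accumulate batches with update, then evaluate; the predictions and labels are illustrative.
import mindspore as ms
import mindspore.nn as nn
import numpy as np
metric = nn.Accuracy('classification')
metric.clear()
preds = ms.Tensor(np.array([[0.1, 0.9], [0.8, 0.2]]), ms.float32)
labels = ms.Tensor(np.array([1, 0]), ms.int32)
metric.update(preds, labels)
print(metric.eval())  # 1.0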
Dynamic Learning Rate
LearningRateSchedule
The dynamic learning rates in this module are all subclasses of LearningRateSchedule. Pass an instance of LearningRateSchedule to an optimizer. During training, the optimizer calls the instance with the current step as input to obtain the current learning rate.
import mindspore.nn as nn
min_lr = 0.01
max_lr = 0.1
decay_steps = 4
cosine_decay_lr = nn.CosineDecayLR(min_lr, max_lr, decay_steps)
net = Net()  # Net is a user-defined network that subclasses nn.Cell
optim = nn.Momentum(net.trainable_params(), learning_rate=cosine_decay_lr, momentum=0.9)
- Calculates the learning rate based on the cosine decay function.
- Calculates the learning rate based on the exponential decay function.
- Calculates the learning rate based on the inverse-time decay function.
- Calculates the learning rate based on the natural exponential decay function.
- Calculates the learning rate based on the polynomial decay function.
- Gets the warm-up learning rate.
Dynamic LR
The dynamic learning rates in this module are all functions. Call the function and pass the resulting list to an optimizer. During training, the optimizer takes result[current_step] as the current learning rate.
import mindspore.nn as nn
min_lr = 0.01
max_lr = 0.1
total_step = 6
step_per_epoch = 1
decay_epoch = 4
lr = nn.cosine_decay_lr(min_lr, max_lr, total_step, step_per_epoch, decay_epoch)
net = Net()  # Net is a user-defined network that subclasses nn.Cell
optim = nn.Momentum(net.trainable_params(), learning_rate=lr, momentum=0.9)
- Calculates the learning rate based on the cosine decay function.
- Calculates the learning rate based on the exponential decay function.
- Calculates the learning rate based on the inverse-time decay function.
- Calculates the learning rate based on the natural exponential decay function.
- Gets the piecewise constant learning rate.
- Calculates the learning rate based on the polynomial decay function.
- Gets the warm-up learning rate.