Common Network Components

Overview

MindSpore encapsulates some common network components for network training, inference, gradient calculation, and data processing.

These network components can be directly used by users and are also used in more advanced encapsulation APIs such as model.train and model.eval.

The following describes three network components, GradOperation, WithLossCell, and TrainOneStepCell, in terms of functions, usage, and internal use.

GradOperation

GradOperation is used to generate the gradient of the input function. The get_all, get_by_list, and sens_param parameters are used to control the gradient calculation method. For details, see MindSpore API The following is an example of using GradOperation:

import numpy as np
import mindspore.nn as nn
from mindspore import Tensor, Parameter
from mindspore import dtype as mstype
import mindspore.ops as ops


class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.matmul = ops.MatMul()
        self.z = Parameter(Tensor(np.array([1.0], np.float32)), name='z')
    def construct(self, x, y):
        x = x * self.z
        out = self.matmul(x, y)
        return out

class GradNetWrtX(nn.Cell):
    def __init__(self, net):
        super(GradNetWrtX, self).__init__()
        self.net = net
        self.grad_op = ops.GradOperation()
    def construct(self, x, y):
        gradient_function = self.grad_op(self.net)
        return gradient_function(x, y)

x = Tensor([[0.5, 0.6, 0.4], [1.2, 1.3, 1.1]], dtype=mstype.float32)
y = Tensor([[0.01, 0.3, 1.1], [0.1, 0.2, 1.3], [2.1, 1.2, 3.3]], dtype=mstype.float32)
GradNetWrtX(Net())(x, y)

Tensor(shape=[2, 3], dtype=Float32, value=
[[1.41000009e+000, 1.60000002e+000, 6.59999943e+000],
 [1.41000009e+000, 1.60000002e+000, 6.59999943e+000]])

The preceding example is used to calculate the gradient value of Net to x. You need to define the network Net as the input of GradOperation. The instance creates GradNetWrtX that contains the gradient operation. Calling GradNetWrtX transfers the network to GradOperation to generate a gradient function, and transfers the input data to the gradient function to return the final result.

All other components, such as WithGradCell and TrainOneStepCell, involved in gradient calculation use GradOperation. You can view the internal implementation of these APIs.

WithLossCell

WithLossCell is essentially a Cell that contains the loss function. To build WithLossCell, you need to define the network and loss function in advance.

The following uses an example to describe how to use this function. First, you need to build a network. The content is as follows:

import numpy as np

import mindspore.context as context
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.nn import TrainOneStepCell, WithLossCell
from mindspore.nn.optim import Momentum
import mindspore.ops as ops

context.set_context(mode=context.GRAPH_MODE, device_target="GPU")


class LeNet(nn.Cell):
    def __init__(self):
        super(LeNet, self).__init__()
        self.relu = ops.ReLU()
        self.batch_size = 32

        self.conv1 = nn.Conv2d(1, 6, kernel_size=5, stride=1, padding=0, has_bias=False, pad_mode='valid')
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0, has_bias=False, pad_mode='valid')
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.reshape = ops.Reshape()
        self.fc1 = nn.Dense(400, 120)
        self.fc2 = nn.Dense(120, 84)
        self.fc3 = nn.Dense(84, 10)

    def construct(self, input_x):
        output = self.conv1(input_x)
        output = self.relu(output)
        output = self.pool(output)
        output = self.conv2(output)
        output = self.relu(output)
        output = self.pool(output)
        output = self.reshape(output, (self.batch_size, -1))
        output = self.fc1(output)
        output = self.relu(output)
        output = self.fc2(output)
        output = self.relu(output)
        output = self.fc3(output)
        return output

The following is an example of using WithLossCell. Define the network and loss functions, create a WithLossCell, and input the input data and label data. WithLossCell returns the calculation result based on the network and loss functions.

data = Tensor(np.ones([32, 1, 32, 32]).astype(np.float32) * 0.01)
label = Tensor(np.ones([32]).astype(np.int32))
net = LeNet()
criterion = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_with_criterion = WithLossCell(net, criterion)
loss = net_with_criterion(data, label)
print("+++++++++Loss+++++++++++++")
print(loss)

The following information is displayed:

+++++++++Loss+++++++++++++
2.302585

TrainOneStepCell

TrainOneStepCell is used to perform single-step training of the network and return the loss result after each training result.

The following describes how to build an instance for using the TrainOneStepCell API to perform network training. The import code of the LeNet and package name is the same as that in the previous case.

data = Tensor(np.ones([32, 1, 32, 32]).astype(np.float32) * 0.01)
label = Tensor(np.ones([32]).astype(np.int32))
net = LeNet()
learning_rate = 0.01
momentum = 0.9

optimizer = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), learning_rate, momentum)
criterion = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
net_with_criterion = WithLossCell(net, criterion)
train_network = TrainOneStepCell(net_with_criterion, optimizer)  # optimizer
for i in range(5):
    train_network.set_train()
    res = train_network(data, label)
    print(f"+++++++++result:{i}++++++++++++")
    print(res)

+++++++++result:0++++++++++++
2.302585
+++++++++result:1++++++++++++
2.2935712
+++++++++result:2++++++++++++
2.2764661
+++++++++result:3++++++++++++
2.2521412
+++++++++result:4++++++++++++
2.2214084

In the case, an optimizer and a WithLossCell instance are built, and then a training network is initialized in TrainOneStepCell. The case is repeated for five times, that is, the network is trained for five times, and the loss result of each time is output, the result shows that the loss value gradually decreases after each training.

The following content will describe how MindSpore uses more advanced encapsulation APIs, that is, the train method in the Model class to train a model. Many network components, such as TrainOneStepCell and WithLossCell, will be used in the internal implementation. You can view the internal implementation of these components.