Building a Neural Network

A neural network model consists of multiple data operation layers, and mindspore.nn provides various basic network modules. The following uses LeNet-5 as an example to first describe how to build a neural network model by using mindspore.nn, and then how to build a LeNet-5 model by using mindvision.classification.models.

mindvision.classification.models is a network model interface developed based on mindspore.nn, providing classic and commonly used network models for convenience.

LeNet-5 Model

LeNet-5 is a typical convolutional neural network proposed by Professor Yann LeCun in 1998. It achieves 99.4% accuracy on the MNIST dataset and is one of the earliest classic models in the CNN field. The model structure is shown in the following figure:

Figure: LeNet-5 network structure

Excluding the input layer, LeNet-5 contains seven layers: three convolutional layers, two subsampling layers, and two fully-connected layers.

Defining a Model Class

In the preceding figure, C indicates a convolutional layer, S indicates a subsampling layer, and F indicates a fully-connected layer.

The input size of an image is fixed at \(32 \times 32\). To achieve a good convolution effect, the digit must be centered in the image, so the \(32 \times 32\) input is the result of padding the original \(28 \times 28\) image. Unlike typical CNNs, which take three-channel color images as input, LeNet takes only normalized binary images. The output of the network is the prediction probability for the digits 0 to 9, that is, the probability that the input image belongs to each of these digits.
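For intuition, the following sketch (using NumPy; the all-ones image and the explicit padding call are illustrative assumptions, since in practice the padding is normally done in the dataset pipeline) pads a \(28 \times 28\) image with 2 zero pixels on each side to obtain the \(32 \times 32\) input:

import numpy as np

# A placeholder 28 x 28 image; real MNIST data would be used in practice.
image = np.ones((28, 28), dtype=np.float32)

# Pad 2 zero pixels on each side: 28 + 2 + 2 = 32.
padded = np.pad(image, pad_width=2, mode='constant', constant_values=0)

print(padded.shape)
(32, 32)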

The Cell class of MindSpore is the base class for building all networks and the basic unit of a network. To define a neural network, inherit the Cell class and override the __init__ and construct methods.

import mindspore.nn as nn

class LeNet5(nn.Cell):
    """
    LeNet-5 network structure
    """
    def __init__(self, num_class=10, num_channel=1):
        super(LeNet5, self).__init__()
        # Convolutional layer, where the number of input channels is num_channel, the number of output channels is 6, and the convolutional kernel size is 5 x 5.
        self.conv1 = nn.Conv2d(num_channel, 6, 5, pad_mode='valid')
        # Convolutional layer, where the number of input channels is 6, the number of output channels is 16, and the convolutional kernel size is 5 x 5.
        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')
        # Fully-connected layer, where the number of inputs is 16 x 5 x 5 and the number of outputs is 120.
        self.fc1 = nn.Dense(16 * 5 * 5, 120)
        # Fully-connected layer, where the number of inputs is 120 and the number of outputs is 84.
        self.fc2 = nn.Dense(120, 84)
        # Fully-connected layer, where the number of inputs is 84 and the number of classes is num_class.
        self.fc3 = nn.Dense(84, num_class)
        # ReLU activation function
        self.relu = nn.ReLU()
        # Pooling layer
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        # Multidimensional arrays are flattened into one-dimensional arrays.
        self.flatten = nn.Flatten()

    def construct(self, x):
        # Use the defined operation to build a forward network.
        x = self.conv1(x)
        x = self.relu(x)
        x = self.max_pool2d(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.max_pool2d(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x

Next, build the neural network model defined above and look at the structure of the network model.

model = LeNet5()

print(model)
LeNet5<
  (conv1): Conv2d<input_channels=1, output_channels=6, kernel_size=(5, 5), stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, has_bias=False, weight_init=normal, bias_init=zeros, format=NCHW>
  (conv2): Conv2d<input_channels=6, output_channels=16, kernel_size=(5, 5), stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, has_bias=False, weight_init=normal, bias_init=zeros, format=NCHW>
  (fc1): Dense<input_channels=400, output_channels=120, has_bias=True>
  (fc2): Dense<input_channels=120, output_channels=84, has_bias=True>
  (fc3): Dense<input_channels=84, output_channels=10, has_bias=True>
  (relu): ReLU<>
  (max_pool2d): MaxPool2d<kernel_size=2, stride=2, pad_mode=VALID>
  (flatten): Flatten<>
>
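As a quick sanity check, you can run a forward pass with a dummy input (the all-ones tensor below is an arbitrary placeholder) and confirm that the network produces one score per class:

import numpy as np
import mindspore as ms

dummy_input = ms.Tensor(np.ones([1, 1, 32, 32]), ms.float32)
logits = model(dummy_input)

print(logits.shape)
(1, 10)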

Model Layers

The following describes the key layers used in LeNet-5 (each is itself a Cell) and then shows how to access model parameters through the instantiated network. For more information about the Cell class, see the mindspore.nn interface.

nn.Conv2d

Add an nn.Conv2d layer to add a convolution operation to the network, which helps the neural network extract features.

import numpy as np

import mindspore as ms

# The number of input channels is 1, the number of output channels is 6, and the convolutional kernel size is 5 x 5. The weights are initialized with the 'normal' initializer, and pad_mode='same' pads the input so that the output keeps the same spatial size.
conv2d = nn.Conv2d(1, 6, 5, has_bias=False, weight_init='normal', pad_mode='same')
input_x = ms.Tensor(np.ones([1, 1, 32, 32]), ms.float32)

print(conv2d(input_x).shape)
(1, 6, 32, 32)
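By contrast, with pad_mode='valid', as used in the LeNet-5 definition above, no padding is applied, so a 5 x 5 kernel shrinks each spatial dimension from 32 to 28 (32 - 5 + 1 = 28):

conv2d_valid = nn.Conv2d(1, 6, 5, has_bias=False, weight_init='normal', pad_mode='valid')

print(conv2d_valid(input_x).shape)
(1, 6, 28, 28)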

nn.ReLU

Add an nn.ReLU layer to add a non-linear activation function to the network, which helps the neural network learn complex features.

import mindspore as ms
relu = nn.ReLU()

input_x = ms.Tensor(np.array([-1, 2, -3, 2, -1]), ms.float16)

output = relu(input_x)
print(output)
[0. 2. 0. 2. 0.]

nn.MaxPool2d

Initialize the nn.MaxPool2d layer and down-sample the 1 x 6 x 28 x 28 array to a 1 x 6 x 7 x 7 array.

import mindspore as ms
max_pool2d = nn.MaxPool2d(kernel_size=4, stride=4)
input_x = ms.Tensor(np.ones([1, 6, 28, 28]), ms.float32)

print(max_pool2d(input_x).shape)
(1, 6, 7, 7)
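The LeNet-5 definition above instead uses kernel_size=2 and stride=2, which halves each spatial dimension:

max_pool2d_lenet = nn.MaxPool2d(kernel_size=2, stride=2)

print(max_pool2d_lenet(input_x).shape)
(1, 6, 14, 14)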

nn.Flatten

Initialize the nn.Flatten layer and flatten the 1 x 16 x 5 x 5 array into a single contiguous array of shape 1 x 400.

import mindspore as ms
flatten = nn.Flatten()
input_x = ms.Tensor(np.ones([1, 16, 5, 5]), ms.float32)
output = flatten(input_x)

print(output.shape)
(1, 400)

nn.Dense

Initialize the nn.Dense layer and perform a linear transformation on the input matrix.

import mindspore as ms
dense = nn.Dense(400, 120, weight_init='normal')
input_x = ms.Tensor(np.ones([1, 400]), ms.float32)
output = dense(input_x)

print(output.shape)
(1, 120)
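nn.Dense computes the affine transform \(y = xW^T + b\), where the weight \(W\) has shape (output channels, input channels). As a minimal check, you can reproduce the output manually in NumPy:

# Reproduce the Dense output with an explicit affine transform.
manual = input_x.asnumpy() @ dense.weight.asnumpy().T + dense.bias.asnumpy()

print(np.allclose(manual, output.asnumpy(), atol=1e-5))
True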

Model Parameters

After the convolutional layers and fully-connected layers in the network are instantiated, each has a weight parameter and a bias parameter, and these parameters are continuously optimized during training. You can use get_parameters() to view the name, shape, and data type of each network layer's parameters, as well as whether their gradients are computed (requires_grad).

for m in model.get_parameters():
    print(f"layer:{m.name}, shape:{m.shape}, dtype:{m.dtype}, requires_grad:{m.requires_grad}")
layer:conv1.weight, shape:(6, 1, 5, 5), dtype:Float32, requires_grad:True
layer:conv2.weight, shape:(16, 6, 5, 5), dtype:Float32, requires_grad:True
layer:fc1.weight, shape:(120, 400), dtype:Float32, requires_grad:True
layer:fc1.bias, shape:(120,), dtype:Float32, requires_grad:True
layer:fc2.weight, shape:(84, 120), dtype:Float32, requires_grad:True
layer:fc2.bias, shape:(84,), dtype:Float32, requires_grad:True
layer:fc3.weight, shape:(10, 84), dtype:Float32, requires_grad:True
layer:fc3.bias, shape:(10,), dtype:Float32, requires_grad:True
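If you only need the parameters that an optimizer will update, the Cell class also provides trainable_params(), which returns only the parameters with requires_grad=True:

trainable = model.trainable_params()

# 8 parameters: two convolutional weights (has_bias is False by default for nn.Conv2d)
# plus a weight and a bias for each of the three Dense layers.
print(len(trainable))
8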

Quickly Building a LeNet-5 Model

The preceding describes how to build a LeNet-5 model by using mindspore.nn.Cell. A prebuilt network model interface is also available in mindvision.classification.models: you can use the lenet interface to build a LeNet-5 model directly.

from mindvision.classification.models import lenet

# `num_classes` indicates the number of classes, and `pretrained` determines whether to load pretrained weights.
model = lenet(num_classes=10, pretrained=False)

for m in model.get_parameters():
    print(f"layer:{m.name}, shape:{m.shape}, dtype:{m.dtype}, requires_grad:{m.requires_grad}")
layer:backbone.conv1.weight, shape:(6, 1, 5, 5), dtype:Float32, requires_grad:True
layer:backbone.conv2.weight, shape:(16, 6, 5, 5), dtype:Float32, requires_grad:True
layer:backbone.fc1.weight, shape:(120, 400), dtype:Float32, requires_grad:True
layer:backbone.fc1.bias, shape:(120,), dtype:Float32, requires_grad:True
layer:backbone.fc2.weight, shape:(84, 120), dtype:Float32, requires_grad:True
layer:backbone.fc2.bias, shape:(84,), dtype:Float32, requires_grad:True
layer:backbone.fc3.weight, shape:(10, 84), dtype:Float32, requires_grad:True
layer:backbone.fc3.bias, shape:(10,), dtype:Float32, requires_grad:True