Parameter Initialization

Initializing with Built-In Parameters

MindSpore provides a variety of network parameter initialization methods, and encapsulates the function of parameter initialization in some operators. This section takes Conv2d as an example to introduce how to use the subclass, Initializer, and string to initialize parameters.

Initializer Initialization

Initializer is the built-in parameter initialization base class of MindSpore. All built-in parameter initialization methods inherit this class. The neural network layer package in mindspore.nn provides input parameters weight_init, bias_init, etc., which can be directly initialized with the instantiated Initializer. Examples are as follows:

import numpy as np
import mindspore.nn as nn
import mindspore as ms
from mindspore.common.initializer import Normal, initializer

input_data = ms.Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))
# Convolution layer, the input channel is 3, the output channel is 64, the size of convolution kernel is 3 * 3, and the weight parameter uses the random number generated by normal distribution, Nomal().
net = nn.Conv2d(3, 64, 3, weight_init=Normal(0.2))
# The network output
output = net(input_data)

String Initialization

In addition to using the instantiated Initializer, MindSpore also provides a simple method for parameter initialization, that is, using the string of initializing method name. This method uses the default parameters of the Initializer to initialize. Examples are as follows:

import numpy as np
import mindspore.nn as nn
import mindspore as ms

net = nn.Conv2d(3, 64, 3, weight_init='normal')
output = net(input_data)

Customized Parameter Initialization

In general, the default parameter initialization provided by MindSpore can meet the initialization requirements of the common neural network layer. When encountering a parameter initialization method that needs to be customized, you can inherit the Initializer custom parameter initialization method. Take XavierNormal as an example:

import math
import numpy as np
from mindspore.common.initializer import Initializer, _calculate_fan_in_and_fan_out, _assignment

class XavierNormal(Initializer):
    def __init__(self, gain=1):
        super().__init__()
        # Configure the parameters required for initialization
        self.gain = gain

    def _initialize(self, arr): # arr is a Tensor to be initialized
        fan_in, fan_out = _calculate_fan_in_and_fan_out(arr.shape) # Compute fan_in, fan_out

        std = self.gain * math.sqrt(2.0 / float(fan_in + fan_out)) # Calculate std value
        data = np.random.normal(0, std, arr.shape) # Construct the initialized array with numpy

        _assignment(arr, data) # Assign the initialized ndarray to arr

After that, we can call it like the built-in initialization method:

net = nn.Conv2d(3, 64, 3, weight_init=XavierNormal())
# The network output
output = net(input_data)

Cell traversal initialization

In addition to using parameters weight_init, bias_init, etc., provided by mindspore.nn, we are also used to constructing a complete neural network first, and then uniformly managing the weight, bias and other parameters. At this time, you need to construct a network and instantiate it, then traverse the cell and assign values to parameters. Here is a simple example:

for name, param in net.parameters_and_names():
    if 'weight' in name:
        param.set_data(initializer(Normal(), param.shape, param.dtype))
    if 'bias' in name:
        param.set_data(initializer('zeros', param.shape, param.dtype))