Function Differences with torch.nn.BatchNorm1d

torch.nn.BatchNorm1d

class torch.nn.BatchNorm1d(
    num_features,
    eps=1e-05,
    momentum=0.1,
    affine=True,
    track_running_stats=True
)

For more information, see torch.nn.BatchNorm1d.

mindspore.nn.BatchNorm1d

class mindspore.nn.BatchNorm1d(
    num_features,
    eps=1e-05,
    momentum=0.9,
    affine=True,
    gamma_init="ones",
    beta_init="zeros",
    moving_mean_init="zeros",
    moving_var_init="ones",
    use_batch_statistics=None
)

For more information, see mindspore.nn.BatchNorm1d.

Differences

PyTorch: the default value of the momentum parameter used to update running_mean and running_var is 0.1, and the update rule is running_mean = (1 - momentum) * running_mean + momentum * batch_mean.

MindSpore: the default value of the momentum parameter is 0.9, and the update rule is moving_mean = momentum * moving_mean + (1 - momentum) * batch_mean. MindSpore's momentum therefore corresponds to 1 - momentum in PyTorch; for example, when PyTorch's momentum is 0.2, MindSpore's momentum should be 0.8.

Code Example

# The following implements BatchNorm1d with MindSpore.
import numpy as np
import torch
import mindspore.nn as nn
from mindspore import Tensor

net = nn.BatchNorm1d(num_features=4, momentum=0.8)
x = Tensor(np.array([[0.7, 0.5, 0.5, 0.6],
                     [0.5, 0.4, 0.6, 0.9]]).astype(np.float32))
output = net(x)
print(output)
# Out:
# [[ 0.6999965   0.4999975  0.4999975  0.59999704 ]
#  [ 0.4999975   0.399998   0.59999704 0.89999545 ]]


# The following implements BatchNorm1d with torch.
input_x = torch.randn(2, 4)
m = torch.nn.BatchNorm1d(4, momentum=0.2)
output = m(input_x)
print(output)
# Out:
# tensor([[-0.9991, -1.0000, -1.0000,  1.0000],
#         [ 0.9991,  1.0000,  1.0000, -1.0000]],
#        grad_fn=<NativeBatchNormBackward>)
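

# The examples above only run a forward pass. The sketch below (a minimal
# check, assuming both frameworks are installed, and reusing the input from
# the MindSpore example) verifies the momentum relationship directly by
# comparing the updated running statistics. The expected values in the
# comments follow from the two update formulas with zero-initialized means.
import numpy as np
import torch
import mindspore.nn as msnn
from mindspore import Tensor

data = np.array([[0.7, 0.5, 0.5, 0.6],
                 [0.5, 0.4, 0.6, 0.9]], dtype=np.float32)
# Per-feature batch mean: [0.6, 0.45, 0.55, 0.75]

# PyTorch: running_mean = (1 - 0.2) * 0 + 0.2 * batch_mean
pt_bn = torch.nn.BatchNorm1d(4, momentum=0.2)
pt_bn.train()
pt_bn(torch.from_numpy(data))
print(pt_bn.running_mean.numpy())
# Expected: [0.12 0.09 0.11 0.15]

# MindSpore: moving_mean = 0.8 * 0 + (1 - 0.8) * batch_mean
ms_bn = msnn.BatchNorm1d(num_features=4, momentum=0.8)
ms_bn.set_train(True)
ms_bn(Tensor(data))
print(ms_bn.moving_mean.asnumpy())
# Expected: the same values as PyTorch's running_mean above

# With momentum=0.2 in PyTorch and momentum=0.8 in MindSpore, both layers
# produce identical running statistics, confirming the 1 - momentum mapping.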