Function Differences with torch.nn.BatchNorm1d
torch.nn.BatchNorm1d
class torch.nn.BatchNorm1d(
num_features,
eps=1e-05,
momentum=0.1,
affine=True,
track_running_stats=True
)(input) -> Tensor
For more information, see torch.nn.BatchNorm1d.
mindspore.nn.BatchNorm1d
class mindspore.nn.BatchNorm1d(
num_features,
eps=1e-5,
momentum=0.9,
affine=True,
gamma_init='ones',
beta_init='zeros',
moving_mean_init='zeros',
moving_var_init='ones',
use_batch_statistics=None,
data_format='NCHW'
)(x) -> Tensor
For more information, see mindspore.nn.BatchNorm1d.
Differences
PyTorch: Applies batch normalization to 2D or 3D input data.
MindSpore: The API implements essentially the same function as PyTorch. The default value of the momentum parameter in MindSpore is 0.9, and it relates to PyTorch's momentum as MindSpore momentum = 1 - PyTorch momentum, so the two default values (0.9 and 0.1) produce the same behavior. The parameter update strategy during training and inference differs from that of PyTorch. For details, please refer to Differences Between MindSpore and PyTorch - nn.BatchNorm2d.
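The momentum conversion can be checked with a few lines of plain Python. This is a minimal sketch that assumes the running-statistic update implied by the conversion above: PyTorch weights the new batch statistic by momentum, while MindSpore weights the old moving value by momentum. The function names are illustrative and are not part of either library.

```python
def torch_style_update(running, batch_stat, momentum=0.1):
    # PyTorch convention: momentum is the weight of the new batch statistic.
    return (1.0 - momentum) * running + momentum * batch_stat


def mindspore_style_update(moving, batch_stat, momentum=0.9):
    # MindSpore convention: momentum is the weight of the old moving value.
    return momentum * moving + (1.0 - momentum) * batch_stat


running_pt = running_ms = 0.0
for batch_mean in [0.6, 0.45, 0.55, 0.75]:
    running_pt = torch_style_update(running_pt, batch_mean)
    running_ms = mindspore_style_update(running_ms, batch_mean)

# With the defaults (0.1 and 0.9) both conventions track the same estimate,
# up to floating-point rounding.
print(abs(running_pt - running_ms) < 1e-12)
```

Under these assumptions, passing momentum=m to PyTorch and momentum=1-m to MindSpore yields the same moving averages.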
| Categories | Subcategories | PyTorch | MindSpore | Differences |
| --- | --- | --- | --- | --- |
| Parameters | Parameter 1 | num_features | num_features | - |
| | Parameter 2 | eps | eps | - |
| | Parameter 3 | momentum | momentum | The function is the same, but the default value is 0.1 in PyTorch and 0.9 in MindSpore. MindSpore's momentum equals 1 minus PyTorch’s momentum, so the default values behave the same |
| | Parameter 4 | affine | affine | - |
| | Parameter 5 | track_running_stats | use_batch_statistics | The function is the same, but different values correspond to different default methods. For details, please refer to Typical differences with PyTorch - BatchNorm |
| | Parameter 6 | - | gamma_init | PyTorch does not have this parameter, while MindSpore can initialize the value of the parameter gamma |
| | Parameter 7 | - | beta_init | PyTorch does not have this parameter, while MindSpore can initialize the value of the parameter beta |
| | Parameter 8 | - | moving_mean_init | PyTorch does not have this parameter, while MindSpore can initialize the value of the parameter moving_mean |
| | Parameter 9 | - | moving_var_init | PyTorch does not have this parameter, while MindSpore can initialize the value of the parameter moving_var |
| | Parameter 10 | - | data_format | PyTorch does not have this parameter |
| Input | Single input | input | x | Same function, different parameter names |
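When porting code, the shared parameters can be translated mechanically. The helper below is a hypothetical sketch, not an API of either framework: it passes num_features, eps, and affine through unchanged and converts momentum. It deliberately leaves out track_running_stats, because its mapping to use_batch_statistics is not one-to-one, and it rejects momentum=None (PyTorch's cumulative moving average), for which this sketch assumes no direct MindSpore counterpart.

```python
def torch_to_mindspore_bn_kwargs(num_features, eps=1e-5, momentum=0.1, affine=True):
    """Hypothetical helper: map torch.nn.BatchNorm1d arguments to
    keyword arguments for mindspore.nn.BatchNorm1d."""
    if momentum is None:
        # PyTorch uses a cumulative moving average when momentum is None;
        # this sketch does not attempt to translate that mode.
        raise ValueError("momentum=None is not handled by this sketch")
    return {
        "num_features": num_features,
        "eps": eps,
        # MindSpore momentum = 1 - PyTorch momentum.
        "momentum": 1.0 - momentum,
        "affine": affine,
    }


kwargs = torch_to_mindspore_bn_kwargs(4, momentum=0.1, affine=False)
print(kwargs)
```

The resulting dictionary could then be unpacked as mindspore.nn.BatchNorm1d(**kwargs); the MindSpore-only initializer arguments keep their defaults.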
Code Example
The two APIs achieve the same function and have the same usage.
# PyTorch
import torch
import numpy as np
from torch import nn, tensor
net = nn.BatchNorm1d(4, affine=False, momentum=0.1)
x = tensor(np.array([[0.7, 0.5, 0.5, 0.6], [0.5, 0.4, 0.6, 0.9]]).astype(np.float32))
output = net(x)
print(output.detach().numpy())
# [[ 0.9995001 0.9980063 -0.998006 -0.99977785]
# [-0.9995007 -0.9980057 0.998006 0.99977785]]
# MindSpore
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor
net = nn.BatchNorm1d(num_features=4, affine=False, momentum=0.9)
# Switch to training mode so that batch statistics are used, matching the PyTorch example above.
net.set_train()
# Returns: BatchNorm1d<num_features=4, eps=1e-05, momentum=0.9, gamma=Parameter (name=gamma, shape=(4,), dtype=Float32, requires_grad=False), beta=Parameter (name=beta, shape=(4,), dtype=Float32, requires_grad=False), moving_mean=Parameter (name=mean, shape=(4,), dtype=Float32, requires_grad=False), moving_variance=Parameter (name=variance, shape=(4,), dtype=Float32, requires_grad=False)>
x = Tensor(np.array([[0.7, 0.5, 0.5, 0.6], [0.5, 0.4, 0.6, 0.9]]).astype(np.float32))
output = net(x)
print(output.asnumpy())
# [[ 0.9995001 0.9980063 -0.998006 -0.9997778]
# [-0.9995007 -0.9980057 0.998006 0.9997778]]
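The printed values can be reproduced by hand. The following is a minimal pure-Python sketch, assuming training-mode normalization with the per-column batch mean and the biased (divide-by-N) batch variance, eps=1e-5, and no affine transform; batch_norm_train is an illustrative name, not a function of either framework.

```python
import math


def batch_norm_train(rows, eps=1e-5):
    """Normalize each column with the batch mean and biased batch variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    variances = [
        sum((v - m) ** 2 for v in col) / n for col, m in zip(columns, means)
    ]
    return [
        [(v - m) / math.sqrt(var + eps) for v, m, var in zip(row, means, variances)]
        for row in rows
    ]


x = [[0.7, 0.5, 0.5, 0.6], [0.5, 0.4, 0.6, 0.9]]
result = batch_norm_train(x)
for row in result:
    print([round(v, 4) for v in row])
```

The results agree with both framework outputs above to about four decimal places; the remaining differences come from float32 arithmetic in the frameworks.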