mindspore.ops.BatchNorm

class mindspore.ops.BatchNorm(is_training=False, epsilon=1e-05, momentum=0.1, data_format='NCHW')

Applies Batch Normalization to the input data and returns the updated parameters.

Batch Normalization is widely used in convolutional neural networks. This operation applies Batch Normalization over inputs to avoid internal covariate shift, as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. It rescales and recenters the features using the statistics of a mini-batch of data and the learned parameters, as described by the following formula:

\[y = \frac{x - mean}{\sqrt{variance + \epsilon}} * \gamma + \beta\]

where \(\gamma\) is scale, \(\beta\) is bias, \(\epsilon\) is epsilon, \(mean\) is the mean of \(x\), and \(variance\) is the variance of \(x\).
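The formula above can be sketched in plain Python for a 2-D input of shape \((N, C)\). This is only an illustrative reference, not the operator's implementation: the helper name batch_norm_reference is hypothetical, and it assumes the mean and variance are supplied per channel, as in inference mode.

```python
import math

def batch_norm_reference(x, scale, bias, mean, variance, epsilon=1e-5):
    """Hypothetical helper: y = (x - mean) / sqrt(variance + epsilon) * scale + bias.

    x is a list of N rows with C values each; scale, bias, mean and
    variance are lists of length C (one entry per channel).
    """
    return [
        [
            (value - mean[c]) / math.sqrt(variance[c] + epsilon) * scale[c] + bias[c]
            for c, value in enumerate(row)
        ]
        for row in x
    ]

# Two samples, two channels; statistics are given rather than computed.
x = [[1.0, 2.0], [3.0, 6.0]]
y = batch_norm_reference(
    x, scale=[1.0, 2.0], bias=[0.0, 1.0], mean=[2.0, 4.0], variance=[1.0, 4.0]
)
print(y)  # each value is close to [[-1, -1], [1, 3]], up to the epsilon term
```

With every argument set to ones, as in the example at the end of this page, the numerator is zero and the result equals the bias.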

Warning

If the operation is used for inference, and outputs “reserve_space_1” and “reserve_space_2” are available, then “reserve_space_1” has the same value as “mean” and “reserve_space_2” has the same value as “variance”.
On Ascend 310, the result accuracy cannot reach 1‰ because of the square root instruction.

Parameters

is_training (bool) – If is_training is True, mean and variance are computed during training. If is_training is False, they are loaded from the checkpoint during inference. Default: False.
epsilon (float) – A small value added for numerical stability. The value must be in the range (0, 1]. Default: 1e-5.
momentum (float) – The hyperparameter used to compute the moving average for running_mean and running_var (e.g. \(new\_running\_mean = (1 - momentum) * running\_mean + momentum * current\_mean\)). The value must be in the range [0, 1]. Default: 0.1.
data_format (str) – The data format, either 'NHWC' or 'NCHW'. The 'NHWC' format is supported only on the GPU target. Default: 'NCHW'.
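The moving-average rule quoted for momentum can be illustrated with a short sketch. The function name update_running_stat and the sample values are hypothetical; the arithmetic follows the formula given in the momentum description.

```python
def update_running_stat(running, current, momentum=0.1):
    """Hypothetical helper: new = (1 - momentum) * running + momentum * current."""
    return [(1 - momentum) * r + momentum * c for r, c in zip(running, current)]

running_mean = [0.0, 1.0]   # statistics accumulated so far
current_mean = [1.0, 3.0]   # statistics of the current mini-batch
new_running_mean = update_running_stat(running_mean, current_mean, momentum=0.1)
print(new_running_mean)  # approximately [0.1, 1.2]
```

A momentum of 0 leaves the running statistic unchanged, and a momentum of 1 replaces it with the current mini-batch statistic.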

Inputs:

If is_training is False, inputs are Tensors.

input_x (Tensor) - Tensor of shape \((N, C)\), with float16 or float32 data type.
scale (Tensor) - Tensor of shape \((C,)\), with float16 or float32 data type.
bias (Tensor) - Tensor of shape \((C,)\), with the same data type as scale.
mean (Tensor) - Tensor of shape \((C,)\), with the same data type as scale.
variance (Tensor) - Tensor of shape \((C,)\), with the same data type as scale.

If is_training is True, scale, bias, mean and variance are Parameters.

input_x (Tensor) - Tensor of shape \((N, C)\), with float16 or float32 data type.
scale (Parameter) - Parameter of shape \((C,)\), with float16 or float32 data type.
bias (Parameter) - Parameter of shape \((C,)\), with the same data type as scale.
mean (Parameter) - Parameter of shape \((C,)\), with the same data type as scale.
variance (Parameter) - Parameter of shape \((C,)\), with the same data type as scale.

Outputs:

Tuple of 5 Tensors, the normalized inputs and the updated parameters.

output_x (Tensor) - The same type and shape as the input_x. The shape is \((N, C)\).
batch_mean (Tensor) - The mean calculated per-dimension over the mini-batches, shape is \((C,)\).
batch_variance (Tensor) - The variance calculated per-dimension over the mini-batches, shape is \((C,)\).
reserve_space_1 (Tensor) - The mean that needs to be reused when calculating gradients, one-dimensional Tensor. The shape is \((C,)\).
reserve_space_2 (Tensor) - The variance that needs to be reused when calculating gradients, one-dimensional Tensor. The shape is \((C,)\).

Raises

TypeError – If is_training is not a bool.
TypeError – If epsilon or momentum is not a float.
TypeError – If data_format is not a str.
TypeError – If input_x, scale, bias, mean or variance is not a Tensor.
TypeError – If the dtype of input_x or scale is neither float16 nor float32.

Supported Platforms: Ascend GPU CPU

Examples

>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> input_x = Tensor(np.ones([2, 2]), mindspore.float32)
>>> scale = Tensor(np.ones([2]), mindspore.float32)
>>> bias = Tensor(np.ones([2]), mindspore.float32)
>>> mean = Tensor(np.ones([2]), mindspore.float32)
>>> variance = Tensor(np.ones([2]), mindspore.float32)
>>> batch_norm = ops.BatchNorm()
>>> output = batch_norm(input_x, scale, bias, mean, variance)
>>> print(output[0])
[[1. 1.]
 [1. 1.]]