mindspore.ops.BatchNorm
- class mindspore.ops.BatchNorm(is_training=False, epsilon=1e-05, momentum=0.1, data_format='NCHW')
Batch Normalization for input data and updated parameters.
Batch Normalization is widely used in convolutional neural networks. This operation applies Batch Normalization over its inputs to reduce internal covariate shift, as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. It rescales and recenters the features using the statistics of a mini-batch of data and the learned parameters, as described by the following formula:
\[y = \frac{x - mean}{\sqrt{variance + \epsilon}} * \gamma + \beta\]
where \(\gamma\) is scale, \(\beta\) is bias, \(\epsilon\) is epsilon, \(mean\) is the mean of \(x\), and \(variance\) is the variance of \(x\).
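The formula above can be checked outside the framework with a few lines of NumPy. This is only a minimal reference sketch for an \((N, C)\) input in inference mode; the function name batch_norm_inference is hypothetical and is not part of MindSpore.

```python
import numpy as np

def batch_norm_inference(x, scale, bias, mean, variance, epsilon=1e-5):
    """Apply y = (x - mean) / sqrt(variance + epsilon) * scale + bias.

    Hypothetical helper for illustration; x has shape (N, C) and the
    other arrays have shape (C,), so they broadcast over the batch axis.
    """
    x = np.asarray(x, dtype=np.float32)
    return (x - mean) / np.sqrt(variance + epsilon) * scale + bias

# Same values as the operator example: all ones.
x = np.ones((2, 2), dtype=np.float32)
ones = np.ones(2, dtype=np.float32)
y = batch_norm_inference(x, ones, ones, ones, ones)
print(y)
```

With x equal to mean, the normalized term is zero, so the output reduces to bias.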
Warning
If the operation is used for inference, and outputs “reserve_space_1” and “reserve_space_2” are available, then “reserve_space_1” has the same value as “mean” and “reserve_space_2” has the same value as “variance”.
On Ascend 310, the result accuracy does not reach 1‰ because of the square root instruction.
- Parameters
is_training (bool) – If is_training is True, mean and variance are computed during training. If is_training is False, they’re loaded from checkpoint during inference. Default: False.
epsilon (float) – A small value added for numerical stability. Default: 1e-5, value must be in (0, 1].
momentum (float) – The hyperparameter used to compute the moving average for running_mean and running_var (e.g. \(new\_running\_mean = (1 - momentum) * running\_mean + momentum * current\_mean\)). Momentum value must be in [0, 1]. Default: 0.1.
data_format (str) – The optional value for data format, either 'NHWC' or 'NCHW'. The 'NHWC' format is only supported on the GPU target. Default: "NCHW".
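The moving-average rule quoted for momentum can be sketched in plain NumPy. The helper name update_running_stat is hypothetical, and the range check simply mirrors the documented [0, 1] constraint; this is an illustration of the formula, not the operator's internal code.

```python
import numpy as np

def update_running_stat(running, current, momentum=0.1):
    """new_running = (1 - momentum) * running + momentum * current.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= momentum <= 1.0:
        raise ValueError("momentum must be in [0, 1]")
    return (1.0 - momentum) * running + momentum * current

running_mean = np.zeros(2, dtype=np.float32)
current_mean = np.array([1.0, 2.0], dtype=np.float32)
new_running_mean = update_running_stat(running_mean, current_mean, momentum=0.1)
print(new_running_mean)
```

A momentum of 0 keeps the running statistic unchanged, while a momentum of 1 replaces it with the current batch statistic.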
- Inputs:
If is_training is False, inputs are Tensors.
input_x (Tensor) - Tensor of shape \((N, C)\), with float16 or float32 data type.
scale (Tensor) - Tensor of shape \((C,)\), with float16 or float32 data type.
bias (Tensor) - Tensor of shape \((C,)\), has the same data type as scale.
mean (Tensor) - Tensor of shape \((C,)\), has the same data type as scale.
variance (Tensor) - Tensor of shape \((C,)\), has the same data type as scale.
If is_training is True, scale, bias, mean and variance are Parameters.
input_x (Tensor) - Tensor of shape \((N, C)\), with float16 or float32 data type.
scale (Parameter) - Parameter of shape \((C,)\), with float16 or float32 data type.
bias (Parameter) - Parameter of shape \((C,)\), has the same data type as scale.
mean (Parameter) - Parameter of shape \((C,)\), has the same data type as scale.
variance (Parameter) - Parameter of shape \((C,)\), has the same data type as scale.
- Outputs:
Tuple of 5 Tensors, the normalized inputs and the updated parameters.
output_x (Tensor) - The same type and shape as the input_x. The shape is \((N, C)\).
batch_mean (Tensor) - The mean calculated per-dimension over the mini-batches, shape is \((C,)\).
batch_variance (Tensor) - The variance calculated per-dimension over the mini-batches, shape is \((C,)\).
reserve_space_1 (Tensor) - The mean that needs to be reused when calculating gradients, one-dimensional Tensor. The shape is \((C,)\).
reserve_space_2 (Tensor) - The variance that needs to be reused when calculating gradients, one-dimensional Tensor. The shape is \((C,)\).
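To make the shapes of output_x, batch_mean and batch_variance concrete, the following NumPy sketch computes per-channel statistics over the batch axis of an \((N, C)\) input. It assumes the population (biased) variance; the convention actually used by the operator is not stated in this section, and the function name batch_norm_training is hypothetical.

```python
import numpy as np

def batch_norm_training(x, scale, bias, epsilon=1e-5):
    """Normalize an (N, C) input with statistics of the current batch.

    Hypothetical helper; uses population variance (ddof=0) as an assumption.
    """
    x = np.asarray(x, dtype=np.float32)
    batch_mean = x.mean(axis=0)       # shape (C,)
    batch_variance = x.var(axis=0)    # shape (C,), ddof=0 (assumption)
    output_x = (x - batch_mean) / np.sqrt(batch_variance + epsilon) * scale + bias
    return output_x, batch_mean, batch_variance

x = np.array([[1.0, 2.0], [3.0, 6.0]], dtype=np.float32)
scale = np.ones(2, dtype=np.float32)
bias = np.zeros(2, dtype=np.float32)
output_x, batch_mean, batch_variance = batch_norm_training(x, scale, bias)
print(output_x.shape, batch_mean.shape, batch_variance.shape)
```

The normalized output keeps the shape of the input, while both statistics collapse the batch axis and have one entry per channel.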
- Raises
- Supported Platforms:
Ascend
GPU
CPU
Examples
>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> input_x = Tensor(np.ones([2, 2]), mindspore.float32)
>>> scale = Tensor(np.ones([2]), mindspore.float32)
>>> bias = Tensor(np.ones([2]), mindspore.float32)
>>> mean = Tensor(np.ones([2]), mindspore.float32)
>>> variance = Tensor(np.ones([2]), mindspore.float32)
>>> batch_norm = ops.BatchNorm()
>>> output = batch_norm(input_x, scale, bias, mean, variance)
>>> print(output[0])
[[1. 1.]
 [1. 1.]]