mindspore.nn.LayerNorm
- class mindspore.nn.LayerNorm(normalized_shape, begin_norm_axis=-1, begin_params_axis=-1, gamma_init='ones', beta_init='zeros', epsilon=1e-07)
Applies Layer Normalization over a mini-batch of inputs.
Layer Normalization is widely used in recurrent neural networks. It normalizes each training case in a mini-batch independently, as described in the paper Layer Normalization. Unlike Batch Normalization, Layer Normalization performs exactly the same computation at training and testing time. The statistics are computed across all channels and pixels of a single sample, never across the batch dimension. It can be described using the following formula.
\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]
- Parameters
normalized_shape (Union[tuple[int], list[int]]) – The normalization is performed over axes begin_norm_axis … R - 1, where R is the rank of the input.
begin_norm_axis (int) – The first normalization dimension: normalization is performed along dimensions begin_norm_axis: rank(inputs). The value should be in [-1, rank(input)). Default: -1.
begin_params_axis (int) – The first parameter (beta, gamma) dimension: the scale and centering parameters have dimensions begin_params_axis: rank(inputs) and are broadcast with the normalized inputs accordingly. The value should be in [-1, rank(input)). Default: -1.
gamma_init (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the gamma weight. A str value names an initializer function, such as ‘zeros’, ‘ones’, ‘xavier_uniform’, ‘he_uniform’, etc. Default: ‘ones’.
beta_init (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the beta weight. A str value names an initializer function, such as ‘zeros’, ‘ones’, ‘xavier_uniform’, ‘he_uniform’, etc. Default: ‘zeros’.
epsilon (float) – A value added to the denominator for numerical stability. Default: 1e-7.
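As a rough illustration of how epsilon, gamma, and beta enter the formula, the sketch below applies it to one flattened sample at a time using only the Python standard library. The helper name `layer_norm_1d` is hypothetical and is not part of MindSpore; it assumes the population variance is used and that gamma and beta have one entry per normalized element.

```python
import math
from statistics import fmean, pvariance


def layer_norm_1d(x, gamma, beta, epsilon=1e-7):
    """Hypothetical helper: normalize one flattened sample, then scale and shift."""
    mean = fmean(x)
    var = pvariance(x, mu=mean)  # population variance, i.e. Var[x] in the formula
    inv_std = 1.0 / math.sqrt(var + epsilon)
    # Elementwise: (x - E[x]) / sqrt(Var[x] + epsilon) * gamma + beta
    return [(xi - mean) * inv_std * g + b for xi, g, b in zip(x, gamma, beta)]


# Two samples in a mini-batch; each one is normalized on its own.
batch = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
gamma = [1.0] * 4  # corresponds to gamma_init='ones'
beta = [0.0] * 4   # corresponds to beta_init='zeros'
normalized = [layer_norm_1d(sample, gamma, beta) for sample in batch]
```

Because the statistics come from each sample alone, the second sample (a scaled copy of the first) normalizes to nearly the same values, and nothing depends on the batch size.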
- Inputs:
input_x (Tensor) - The shape of ‘input_x’ is \((x_1, x_2, ..., x_R)\), and input_shape[begin_norm_axis:] is equal to normalized_shape.
- Outputs:
Tensor, the normalized, scaled, and offset tensor, with the same shape and data type as input_x.
- Raises
- Supported Platforms:
Ascend
GPU
CPU
Examples
>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor, nn
>>> x = Tensor(np.ones([20, 5, 10, 10]), mindspore.float32)
>>> shape1 = x.shape[1:]
>>> m = nn.LayerNorm(shape1, begin_norm_axis=1, begin_params_axis=1)
>>> output = m(x).shape
>>> print(output)
(20, 5, 10, 10)
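To make the roles of begin_norm_axis and begin_params_axis in the example above more concrete, the following sketch works out which axes are normalized and what shape gamma and beta would take. It is plain shape arithmetic based on the parameter descriptions, not MindSpore code; `layer_norm_axes` is a hypothetical helper, and it assumes that -1 refers to the last axis, as in Python indexing.

```python
def layer_norm_axes(input_shape, begin_norm_axis=-1, begin_params_axis=-1):
    """Hypothetical helper: return (normalized axes, gamma/beta shape)."""
    rank = len(input_shape)
    for axis in (begin_norm_axis, begin_params_axis):
        # The documented valid range is [-1, rank(input)).
        if not -1 <= axis < rank:
            raise ValueError(f"axis {axis} is outside [-1, {rank})")
    norm_start = begin_norm_axis % rank      # assumption: -1 maps to the last axis
    params_start = begin_params_axis % rank
    norm_axes = tuple(range(norm_start, rank))
    params_shape = tuple(input_shape[params_start:])
    return norm_axes, params_shape


# Shapes from the example: input (20, 5, 10, 10) with both axes set to 1.
norm_axes, params_shape = layer_norm_axes((20, 5, 10, 10), 1, 1)
print(norm_axes)     # (1, 2, 3)
print(params_shape)  # (5, 10, 10)
```

With these settings the batch axis 0 is left out of the statistics, and params_shape matches the normalized_shape passed in the example, x.shape[1:].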