mindspore.nn.LayerNorm

class mindspore.nn.LayerNorm(normalized_shape, begin_norm_axis=-1, begin_params_axis=-1, gamma_init='ones', beta_init='zeros', epsilon=1e-07, dtype=mstype.float32)

Applies Layer Normalization over a mini-batch of inputs.

Layer Normalization is widely used in recurrent neural networks. It normalizes each training case in a mini-batch independently, as described in the paper Layer Normalization. Unlike Batch Normalization, Layer Normalization performs exactly the same computation at training and testing time. The statistics are computed across all channels and pixels of a single sample, never across the batch dimension. \(\gamma\) and \(\beta\) are the trainable scale and shift parameters. It can be described using the following formula:

\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]
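The formula can be illustrated with a small NumPy reference. This is a minimal sketch, not MindSpore's implementation: the function name layer_norm_reference is hypothetical, and it assumes the variance is the population variance (divided by N) taken over all axes from begin_norm_axis to the last one.

```python
import numpy as np

def layer_norm_reference(x, begin_norm_axis=-1, gamma=None, beta=None, epsilon=1e-7):
    """NumPy sketch of y = (x - E[x]) / sqrt(Var[x] + epsilon) * gamma + beta."""
    x = np.asarray(x, dtype=np.float32)
    # Resolve a negative axis, then normalize over every axis from it to the last.
    start = begin_norm_axis % x.ndim
    axes = tuple(range(start, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)  # assumption: population variance
    y = (x - mean) / np.sqrt(var + epsilon)
    # Leaving gamma/beta as None mirrors the 'ones' and 'zeros' initial values.
    if gamma is not None:
        y = y * gamma
    if beta is not None:
        y = y + beta
    return y

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(20, 5, 10, 10)).astype(np.float32)
y = layer_norm_reference(x, begin_norm_axis=1)

# Each of the 20 samples is normalized on its own, independent of the batch.
per_sample_mean = y.reshape(20, -1).mean(axis=1)
per_sample_var = y.reshape(20, -1).var(axis=1)
print(y.shape)
```

Because the statistics never mix samples, the result for one sample does not change with the batch size, which is why the computation is identical at training and testing time.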
Parameters
  • normalized_shape (Union(tuple[int], list[int])) – The normalization is performed over axes begin_norm_axis … R - 1, where R is the number of dimensions of input x.

  • begin_norm_axis (int) – The first axis to normalize: normalization is performed over dimensions begin_norm_axis through R - 1. The value must be in [-1, R). Default: -1 .

  • begin_params_axis (int) – The first axis of the input along which the parameters \((\gamma, \beta)\) are applied. The value must be in [-1, R). Default: -1 .

  • gamma_init (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the \(\gamma\) weight. A str value names an initializer accepted by the function initializer, such as 'zeros' , 'ones' , 'xavier_uniform' , 'he_uniform' , etc. Default: 'ones' .

  • beta_init (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the \(\beta\) weight. A str value names an initializer accepted by the function initializer, such as 'zeros' , 'ones' , 'xavier_uniform' , 'he_uniform' , etc. Default: 'zeros' .

  • epsilon (float) – A value added to the denominator for numerical stability (\(\epsilon\)). Default: 1e-7 .

  • dtype (mindspore.dtype) – Dtype of Parameters. Default: mstype.float32 .
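The role of epsilon is easiest to see on a constant input, where the variance is exactly zero. The NumPy sketch below is an illustration of the formula only, not a call into MindSpore; it compares the normalized result with and without the added term.

```python
import numpy as np

# A constant input: every row has zero variance along the last axis.
x = np.ones((2, 4), dtype=np.float32)
mean = x.mean(axis=-1, keepdims=True)
var = x.var(axis=-1, keepdims=True)

epsilon = 1e-7
# With epsilon the denominator is sqrt(1e-7) > 0, so the result is a clean 0.
y = (x - mean) / np.sqrt(var + epsilon)

# Without epsilon the expression becomes 0 / 0, which evaluates to NaN.
with np.errstate(divide="ignore", invalid="ignore"):
    y_no_eps = (x - mean) / np.sqrt(var)

print(y)
print(np.isnan(y_no_eps).all())
```

This is the same situation as normalizing an all-ones tensor: with the default 'ones' and 'zeros' initial values for \(\gamma\) and \(\beta\), the formula yields zeros everywhere.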

Inputs:
  • x (Tensor) - The shape of x is \((x_1, x_2, ..., x_R)\), and x.shape[begin_norm_axis:] must be equal to normalized_shape.

Outputs:

Tensor, the normalized, scaled, and shifted tensor, with the same shape and data type as x.
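The shape constraint on x can be checked before building the layer. The helper below is a hypothetical illustration written in plain Python (matches_normalized_shape is not part of MindSpore); it only restates the slicing rule given under Inputs.

```python
def matches_normalized_shape(x_shape, normalized_shape, begin_norm_axis=-1):
    """Return True if x_shape[begin_norm_axis:] equals normalized_shape."""
    return tuple(x_shape[begin_norm_axis:]) == tuple(normalized_shape)

x_shape = (20, 5, 10, 10)

# Normalize over the three trailing axes (begin_norm_axis=1).
ok_trailing = matches_normalized_shape(x_shape, (5, 10, 10), begin_norm_axis=1)
# Normalize over the last axis only (the default begin_norm_axis=-1).
ok_last = matches_normalized_shape(x_shape, (10,), begin_norm_axis=-1)
# A normalized_shape that does not cover all trailing axes is rejected.
bad = matches_normalized_shape(x_shape, (5, 10), begin_norm_axis=1)

print(ok_trailing, ok_last, bad)
```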

Raises
  • TypeError – If normalized_shape is neither a list nor tuple.

  • TypeError – If begin_norm_axis or begin_params_axis is not an int.

  • TypeError – If epsilon is not a float.

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import numpy as np
>>> x = ms.Tensor(np.ones([20, 5, 10, 10]), ms.float32)
>>> shape1 = x.shape[1:]
>>> m = ms.nn.LayerNorm(shape1, begin_norm_axis=1, begin_params_axis=1)
>>> output = m(x).shape
>>> print(output)
(20, 5, 10, 10)