mindspore.ops.GeLU

class mindspore.ops.GeLU

Gaussian Error Linear Units activation function.

GeLU is described in the paper Gaussian Error Linear Units (GELUs). See also BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

GeLU is defined as follows:

\[GELU(x_i) = x_i*P(X < x_i)\]

where \(P\) is the cumulative distribution function of the standard Gaussian distribution and \(x_i\) is the input element.
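For a standard Gaussian, \(P(X < x_i) = \frac{1}{2}\left(1 + \operatorname{erf}\left(x_i / \sqrt{2}\right)\right)\), so the definition can be reproduced directly from the error function. The following is a minimal sketch, assuming the functional mindspore.ops.erf is available in your MindSpore version:

>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> x = Tensor(np.array([1.0, 2.0, 3.0]), mindspore.float32)
>>> # GELU(x) = x * P(X < x) = x * 0.5 * (1 + erf(x / sqrt(2)))
>>> reference = x * 0.5 * (1.0 + ops.erf(x / 2.0 ** 0.5))

Up to backend-specific approximation, reference matches the output of ops.GeLU()(x) shown in the Examples below.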

Note

When computing the gradient of GELU for an infinite input, the backward outputs differ between 'Ascend' and 'GPU'. When x is -inf, Ascend returns 0 while GPU returns NaN; when x is inf, Ascend returns dy while GPU returns NaN. Mathematically, the Ascend results have higher precision.
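As a sketch of how this backward difference can be observed, the gradient can be evaluated with the functional mindspore.grad interface (assumed available in your MindSpore version; older versions expose the same behavior through ops.GradOperation):

>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> x = Tensor(np.array([float('-inf'), 0.0, float('inf')]), mindspore.float32)
>>> grad_fn = mindspore.grad(lambda v: ops.GeLU()(v))
>>> dx = grad_fn(x)
>>> # Ascend: the -inf entry yields 0 and the inf entry yields dy (1 here);
>>> # GPU: both infinite entries yield NaN.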

Inputs:
  • x (Tensor) - The input of the activation function GeLU, the data type is float16, float32 or float64.

Outputs:

Tensor, with the same type and shape as x.

Raises:
  • TypeError – If x is not a Tensor.

  • TypeError – If dtype of x is not float16, float32 or float64.

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> x = Tensor(np.array([1.0, 2.0, 3.0]), mindspore.float32)
>>> result = ops.GeLU()(x)
>>> print(result)
[0.841192  1.9545976  2.9963627]