mindspore.nn.GELU

class mindspore.nn.GELU(approximate=True)

Gaussian error linear unit activation function.

Applies GELU to each element of the input. The input can be a Tensor of any valid shape.

GELU is defined as:

$GELU(x_{i}) = x_{i} * P(X < x_{i})$,

where $P$ is the cumulative distribution function of the standard Gaussian distribution and $x_{i}$ is an element of the input.
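As a minimal sketch of the definition above, the snippet below evaluates $x * P(X < x)$ for a single scalar in plain Python, using the standard library's `statistics.NormalDist` for the standard Gaussian CDF. The helper name `gelu_from_cdf` is hypothetical and is not part of MindSpore; it only illustrates the formula.

```python
from statistics import NormalDist

# Standard Gaussian N(0, 1); its cdf(x) is P(X < x).
_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def gelu_from_cdf(x: float) -> float:
    """Hypothetical helper (not a MindSpore API): GELU(x) = x * P(X < x)."""
    return x * _STD_NORMAL.cdf(x)


# Evaluate the definition at a few sample points.
for value in (-1.0, 0.0, 2.0):
    print(value, gelu_from_cdf(value))
```

Negative inputs are scaled by a small probability and are pushed toward zero, while large positive inputs pass through almost unchanged.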

[Figure: GELU function graph]

Parameters:

approximate (bool, optional) - Whether to enable the approximation. Default: True. If approximate is True, the Gaussian error linear activation function is:

$0.5 * x * (1 + \tanh(\sqrt{2 / \pi} * (x + 0.044715 * x^{3})))$ .

Otherwise it is: $x * P(X \leq x) = 0.5 * x * (1 + \operatorname{erf}(x / \sqrt{2}))$ , where $X \sim N(0, 1)$ .
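The two formulas above can be compared with a short, self-contained sketch in plain Python. The function names `gelu_tanh` and `gelu_erf` are hypothetical helpers written for illustration, not MindSpore APIs. They work in double precision, so the results may differ from the operator's float16 or float32 output in the last digits.

```python
import math


def gelu_tanh(x: float) -> float:
    """Hypothetical helper: the tanh approximation (approximate=True formula)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_erf(x: float) -> float:
    """Hypothetical helper: the erf-based form (approximate=False formula)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Print both values and their absolute difference for a few inputs.
for value in (-1.0, 2.0, 4.0):
    approx = gelu_tanh(value)
    exact = gelu_erf(value)
    print(value, approx, exact, abs(approx - exact))
```

For these inputs the two forms agree to roughly three or four decimal places, which is why the tanh form is commonly used as a cheaper stand-in for the erf form.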

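Note

When computing the input gradient of GELU, the backward-pass output differs between Ascend and GPU if the input is inf.
When the input x is -inf, Ascend computes 0 and GPU computes nan.
When the input x is inf, Ascend computes the gradient dy and GPU computes nan.
Mathematically, the Ascend result is the more accurate one.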
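One plausible way to see where a `nan` can come from, assuming the erf-based form: the derivative of GELU is $\Phi(x) + x * \varphi(x)$, where $\Phi$ is the standard Gaussian CDF and $\varphi$ its density. At $x = \pm\infty$ the second term evaluates to $\infty * 0$, which is `nan` in IEEE 754 floating-point arithmetic, even though its mathematical limit is 0. The sketch below is illustrative only: `gelu_grad_naive` and `gelu_grad_limit` are hypothetical helpers and do not describe how the Ascend or GPU kernels are actually implemented.

```python
import math


def gelu_grad_naive(x: float) -> float:
    """Hypothetical helper: d/dx GELU(x) = Phi(x) + x * phi(x), evaluated directly."""
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    # For x = +/-inf, pdf is 0.0 and x * pdf is inf * 0.0, which is nan.
    return cdf + x * pdf


def gelu_grad_limit(x: float) -> float:
    """Hypothetical helper: same derivative, using the analytic limits at +/-inf."""
    if math.isinf(x):
        # x * phi(x) tends to 0, so the derivative tends to Phi(x): 1 or 0.
        return 1.0 if x > 0 else 0.0
    return gelu_grad_naive(x)


for value in (float("-inf"), 0.0, float("inf")):
    print(value, gelu_grad_naive(value), gelu_grad_limit(value))
```

Multiplying the limit values by the upstream gradient dy gives 0 for -inf and dy for inf, which matches the Ascend behavior described in the note.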

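Inputs:

x (Tensor) - The Tensor on which GELU is computed. The data type is float16, float32 or float64. The shape is $(N, *)$ , where $*$ means any number of additional dimensions.

Outputs:

Tensor, with the same data type and shape as x.

Raises:

TypeError - If the data type of x is not float16, float32 or float64.

Supported Platforms:

Ascend GPU CPU

Examples: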

>>> import mindspore
>>> from mindspore import Tensor, nn
>>> import numpy as np
>>> x = Tensor(np.array([[-1.0, 4.0, -8.0], [2.0, -5.0, 9.0]]), mindspore.float32)
>>> gelu = nn.GELU()
>>> output = gelu(x)
>>> print(output)
[[-1.5880802e-01  3.9999299e+00 -3.1077917e-21]
 [ 1.9545976e+00 -2.2918017e-07  9.0000000e+00]]
>>> gelu = nn.GELU(approximate=False)
>>> # CPU does not support "approximate=False"; "approximate=True" is used instead
>>> output = gelu(x)
>>> print(output)
[[-1.5865526e-01  3.9998732e+00 -0.0000000e+00]
 [ 1.9544997e+00 -1.4901161e-06  9.0000000e+00]]