mindspore.mint.nn.functional.gelu

mindspore.mint.nn.functional.gelu(input, *, approximate='none') → Tensor

Gaussian Error Linear Unit (GELU) activation function.

GELU is described in the paper Gaussian Error Linear Units (GELUs). For further details, see BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

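When approximate is 'none', GELU is defined as:

$$\text{GELU}(x_{i}) = x_{i} * P(X < x_{i}),$$

where $P$ is the cumulative distribution function of the standard Gaussian distribution and $x_{i}$ is an element of the input.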
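The exact definition can be sketched in plain Python by writing the standard Gaussian CDF in terms of the error function, $P(X < x) = 0.5 * (1 + \mathrm{erf}(x / \sqrt{2}))$. The helper name `gelu_exact` is hypothetical and is not part of MindSpore; this is a scalar reference sketch, not the library's kernel.

```python
import math


def gelu_exact(x: float) -> float:
    """Scalar reference for GELU(x) = x * P(X < x) (hypothetical helper)."""
    # Standard Gaussian CDF expressed through the error function.
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * cdf


for x in (-1.0, 0.0, 2.0):
    print(f"gelu_exact({x}) = {gelu_exact(x):.6f}")
```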

When approximate is 'tanh', GELU is defined as:

$$\text{GELU}(x_{i}) = 0.5 * x_{i} * (1 + \tanh(\sqrt{2 / \pi} * (x_{i} + 0.044715 * x_{i}^{3})))$$
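The tanh form can be transcribed term by term into plain Python. The helper name `gelu_tanh` is hypothetical; this is a scalar sketch of the formula, not the implementation used on the device.

```python
import math


def gelu_tanh(x: float) -> float:
    """Scalar reference for the tanh approximation of GELU (hypothetical helper)."""
    # Argument of tanh: sqrt(2 / pi) * (x + 0.044715 * x^3).
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


for x in (-1.0, 0.0, 2.0):
    print(f"gelu_tanh({x}) = {gelu_tanh(x):.6f}")
```

The approximation stays close to the exact form but is not identical to it, which is why the two modes print slightly different values for the same input.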

[Figure: graph of the GELU function]

Note

On the Ascend platform, when input is -inf, its gradient is 0; when input is inf, its gradient is dout.
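These boundary values match the limits of the analytic derivative of the exact form, $P(X < x) + x * p(x)$ with $p$ the standard Gaussian density, which tends to 0 as $x \to -\infty$ and to 1 as $x \to +\infty$. The sketch below assumes that dout denotes the incoming (upstream) gradient. The helper `gelu_grad` is hypothetical and only illustrates the math; it is not the Ascend kernel.

```python
import math


def gelu_grad(x: float, dout: float = 1.0) -> float:
    """Gradient of exact GELU at x, scaled by dout (hypothetical helper)."""
    if math.isinf(x):
        # Evaluating x * pdf directly would give inf * 0 = nan,
        # so return the analytic limits instead: 0 at -inf, dout at +inf.
        return dout if x > 0 else 0.0
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return dout * (cdf + x * pdf)


print(gelu_grad(float("-inf")))      # limit at -inf
print(gelu_grad(float("inf"), 2.5))  # limit at +inf, equals dout
print(gelu_grad(0.0))                # derivative at the origin
```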

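Parameters:

input (Tensor) - The Tensor on which GELU is computed. The data type is float16, float32, or float64.

Keyword Arguments:

approximate (str, optional) - The GELU approximation algorithm. Two options are available: 'none' and 'tanh'. Default: 'none'.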

Returns:

Tensor, with the same data type and shape as input.

Raises:

TypeError - If input is not a Tensor.
TypeError - If the data type of input is not bfloat16, float16, float32, or float64.
ValueError - If approximate is neither 'none' nor 'tanh'.

Supported Platforms:

Ascend

Examples:

>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, mint
>>> input = Tensor(np.array([[-1.0, 4.0, -8.0], [2.0, -5.0, 9.0]]), mindspore.float32)
>>> result = mint.nn.functional.gelu(input)
>>> print(result)
[[-1.58655241e-01  3.99987316e+00 -0.00000000e+00]
 [ 1.95449972e+00 -1.41860323e-06  9.00000000e+00]]
>>> result = mint.nn.functional.gelu(input, approximate="tanh")
>>> print(result)
[[-1.58808023e-01  3.99992990e+00 -3.10779147e-21]
 [ 1.95459759e+00 -2.29180174e-07  9.00000000e+00]]
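The printed values can be cross-checked against the two formulas without MindSpore. The sketch below recomputes both modes in double precision with the standard library and measures the largest absolute difference from the float32 values shown in the example. The names `ref_none` and `ref_tanh` are hypothetical, and small differences are expected because the example output is float32.

```python
import math


def ref_none(x: float) -> float:
    # Exact form: x * P(X < x), with the Gaussian CDF written via erf.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ref_tanh(x: float) -> float:
    # Tanh approximation, transcribed from the formula in this page.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Flattened example input and the outputs printed in the example.
inputs = [-1.0, 4.0, -8.0, 2.0, -5.0, 9.0]
doc_none = [-1.58655241e-01, 3.99987316e+00, -0.0,
            1.95449972e+00, -1.41860323e-06, 9.0]
doc_tanh = [-1.58808023e-01, 3.99992990e+00, -3.10779147e-21,
            1.95459759e+00, -2.29180174e-07, 9.0]

max_err_none = max(abs(ref_none(x) - y) for x, y in zip(inputs, doc_none))
max_err_tanh = max(abs(ref_tanh(x) - y) for x, y in zip(inputs, doc_tanh))
print("max abs error, approximate='none':", max_err_none)
print("max abs error, approximate='tanh':", max_err_tanh)
```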