mindspore.nn.GRU

class mindspore.nn.GRU(*args, **kwargs)

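GRU (Gated Recurrent Unit) is a gated recurrent unit network, a type of recurrent neural network (RNN). It computes the output sequence and the final state from the input sequence and a given initial state.

Applies a GRU layer to the input.

A GRU has two gates: an update gate and a reset gate. Denote two consecutive time steps as $t - 1$ and $t$. Given an input $x_{t}$ at time $t$ and a hidden state $h_{t - 1}$, the update gate and the reset gate at time $t$ are computed with a gating mechanism. The update gate $z_{t}$ controls how much of the previous state is carried over into the current state. The reset gate $r_{t}$ controls how much of the previous state is written into the current candidate state $n_{t}$. The complete formulation is as follows.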

$$
\begin{array}{ll}
r_{t} = \sigma(W_{ir} x_{t} + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\
z_{t} = \sigma(W_{iz} x_{t} + b_{iz} + W_{hz} h_{(t-1)} + b_{hz}) \\
n_{t} = \tanh(W_{in} x_{t} + b_{in} + r_{t} * (W_{hn} h_{(t-1)} + b_{hn})) \\
h_{t} = (1 - z_{t}) * n_{t} + z_{t} * h_{(t-1)}
\end{array}
$$

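Here $\sigma$ is the sigmoid activation function and $*$ is the element-wise product. $W$ and $b$ are the learnable weights and biases between the output and the input in the formulas. For example, $W_{ir}$ and $b_{ir}$ are the weight and bias used to transform the input $x$ into $r$. For details, see the paper Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.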
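
The four formulas can be traced by hand with a small NumPy sketch of a single time step. This is only an illustration, not MindSpore's implementation: the helper names `sigmoid` and `gru_step`, and the choice to stack the three gate weights in the order [r, z, n], are assumptions made for this example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_i, W_h, b_i, b_h):
    """One GRU time step.

    x_t: (batch, input_size), h_prev: (batch, hidden_size)
    W_i: (3 * hidden_size, input_size), W_h: (3 * hidden_size, hidden_size)
    b_i, b_h: (3 * hidden_size,)
    Gate parameters are stacked as [r, z, n] (layout assumed for this sketch).
    """
    W_ir, W_iz, W_in = np.split(W_i, 3, axis=0)
    W_hr, W_hz, W_hn = np.split(W_h, 3, axis=0)
    b_ir, b_iz, b_in = np.split(b_i, 3)
    b_hr, b_hz, b_hn = np.split(b_h, 3)

    r_t = sigmoid(x_t @ W_ir.T + b_ir + h_prev @ W_hr.T + b_hr)          # reset gate
    z_t = sigmoid(x_t @ W_iz.T + b_iz + h_prev @ W_hz.T + b_hz)          # update gate
    n_t = np.tanh(x_t @ W_in.T + b_in + r_t * (h_prev @ W_hn.T + b_hn))  # candidate state
    return (1.0 - z_t) * n_t + z_t * h_prev                              # new hidden state

rng = np.random.default_rng(0)
batch, input_size, hidden_size = 3, 10, 16
x_t = rng.standard_normal((batch, input_size))
h_prev = rng.standard_normal((batch, hidden_size))
W_i = rng.standard_normal((3 * hidden_size, input_size)) * 0.1
W_h = rng.standard_normal((3 * hidden_size, hidden_size)) * 0.1
b_i = np.zeros(3 * hidden_size)
b_h = np.zeros(3 * hidden_size)

h_t = gru_step(x_t, h_prev, W_i, W_h, b_i, b_h)
print(h_t.shape)  # (3, 16)
```

Because $h_{t}$ is a gated blend of $n_{t}$ and $h_{(t-1)}$, pushing the update gate toward 1 (for example with a large $b_{iz}$) makes the step return a state very close to `h_prev`.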

Note

When GRU runs on Ascend, hidden_size must be a multiple of 16.
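
One way to respect this constraint is to check a requested size before building the network and, if needed, round it up. The helper `round_up_to_multiple` below is hypothetical and not part of MindSpore. Rounding up is only one possible policy, since it changes the model's capacity.

```python
def round_up_to_multiple(value, multiple=16):
    """Smallest multiple of `multiple` that is >= value (hypothetical helper)."""
    return ((value + multiple - 1) // multiple) * multiple

for requested in (16, 20, 100):
    is_valid = requested % 16 == 0
    print(requested, is_valid, round_up_to_multiple(requested))
# 16 True 16
# 20 False 32
# 100 False 112
```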

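Parameters:

input_size (int) - Number of features in the input.
hidden_size (int) - Number of features in the hidden state.
num_layers (int) - Number of stacked GRU layers. Default: 1.
has_bias (bool) - Whether the cell has the bias terms $b_{in}$ and $b_{hn}$. Default: True.
batch_first (bool) - Whether the first dimension of the input x is batch_size. Default: False.
dropout (float) - Dropout probability applied to the input of every layer except the first. Must be in the range [0.0, 1.0). Default: 0.0.
bidirectional (bool) - Whether the GRU is bidirectional. If bidirectional=True, num_directions=2 and the GRU is bidirectional; otherwise num_directions=1 and the GRU is unidirectional. Default: False.
dtype (mindspore.dtype) - dtype of the Parameters. Default: mstype.float32.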

Inputs:

x (Tensor) - Tensor of data type mindspore.float32 and shape $(seq\_len, batch\_size, input\_size)$, or $(batch\_size, seq\_len, input\_size)$ when batch_first is True.
hx (Tensor) - Tensor of data type mindspore.float32 and shape $(num\_directions * num\_layers, batch\_size, hidden\_size)$.
seq_length (Tensor) - The length of each sequence in the input batch. Tensor of shape $(batch\_size)$. Default: None. This input gives the true sequence lengths before padding, so that padded elements are not used to compute the hidden state and do not affect the final output. It is recommended whenever x contains padded elements.
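
The purpose of seq_length can be illustrated with a conceptual NumPy sketch: once a sequence has passed its true length, its hidden state is no longer updated, so whatever sits in the padded positions cannot reach the final state. This is not MindSpore's implementation. The helpers `gru_step` and `run_gru`, the bias-free single-layer unidirectional setup, and the [r, z, n] weight layout are all assumptions made for the example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_i, W_h):
    # Bias-free GRU step; weights stacked as [r, z, n] (layout assumed for this sketch).
    W_ir, W_iz, W_in = np.split(W_i, 3, axis=0)
    W_hr, W_hz, W_hn = np.split(W_h, 3, axis=0)
    r = sigmoid(x_t @ W_ir.T + h_prev @ W_hr.T)
    z = sigmoid(x_t @ W_iz.T + h_prev @ W_hz.T)
    n = np.tanh(x_t @ W_in.T + r * (h_prev @ W_hn.T))
    return (1.0 - z) * n + z * h_prev

def run_gru(x, h0, seq_length, W_i, W_h):
    """x: (seq_len, batch, input_size); seq_length: (batch,) true lengths."""
    h = h0
    for t in range(x.shape[0]):
        h_new = gru_step(x[t], h, W_i, W_h)
        active = (t < seq_length)[:, None]  # (batch, 1): sequences that have not ended yet
        h = np.where(active, h_new, h)      # padded steps leave the state untouched
    return h

rng = np.random.default_rng(0)
seq_len, batch, input_size, hidden_size = 5, 2, 4, 8
W_i = rng.standard_normal((3 * hidden_size, input_size)) * 0.1
W_h = rng.standard_normal((3 * hidden_size, hidden_size)) * 0.1
h0 = np.zeros((batch, hidden_size))
seq_length = np.array([5, 3])  # the second sequence has 2 padded steps

x = rng.standard_normal((seq_len, batch, input_size))
x_other_pad = x.copy()
x_other_pad[3:, 1, :] = 99.0   # change only the padding of the second sequence

h_a = run_gru(x, h0, seq_length, W_i, W_h)
h_b = run_gru(x_other_pad, h0, seq_length, W_i, W_h)
print(np.allclose(h_a, h_b))  # True: padding values do not reach the final state
```

Without the mask, the two runs would generally end in different states for the second sequence, which is the effect the seq_length input is meant to avoid.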

Outputs:

Tuple, a tuple containing (output, hx_n).

output (Tensor) - Tensor of shape $(seq\_len, batch\_size, num\_directions * hidden\_size)$, or $(batch\_size, seq\_len, num\_directions * hidden\_size)$ when batch_first is True.
hx_n (Tensor) - Tensor of shape $(num\_directions * num\_layers, batch\_size, hidden\_size)$.
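
The shape rules for the two outputs can be written down as a small pure-Python function. `gru_output_shapes` is a hypothetical helper that only restates the shapes documented here. It does not call MindSpore.

```python
def gru_output_shapes(seq_len, batch_size, hidden_size,
                      num_layers=1, bidirectional=False, batch_first=False):
    """Return the documented shapes of (output, hx_n) for a given configuration."""
    num_directions = 2 if bidirectional else 1
    if batch_first:
        output = (batch_size, seq_len, num_directions * hidden_size)
    else:
        output = (seq_len, batch_size, num_directions * hidden_size)
    hx_n = (num_directions * num_layers, batch_size, hidden_size)
    return output, hx_n

print(gru_output_shapes(5, 3, 16, num_layers=2, batch_first=True))
# ((3, 5, 16), (2, 3, 16))
print(gru_output_shapes(5, 3, 16, num_layers=2, bidirectional=True))
# ((5, 3, 32), (4, 3, 16))
```

Note that hx_n keeps the layer-and-direction dimension first regardless of batch_first, as the shapes listed here indicate.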

Raises:

TypeError - If input_size, hidden_size or num_layers is not an int.
TypeError - If has_bias, batch_first or bidirectional is not a bool.
TypeError - If dropout is neither a float nor an int.
ValueError - If dropout is not in the range [0.0, 1.0).
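
The same checks can be mirrored in plain Python to validate a configuration before constructing the layer. `check_gru_args` is a hypothetical helper written for this example, and the error messages are invented. Rejecting bool where an int is expected is a choice made in this sketch and may differ from what MindSpore does.

```python
def check_gru_args(input_size, hidden_size, num_layers=1, has_bias=True,
                   batch_first=False, dropout=0.0, bidirectional=False):
    """Mirror the documented argument checks (hypothetical helper, not MindSpore code)."""
    int_args = {"input_size": input_size, "hidden_size": hidden_size,
                "num_layers": num_layers}
    for name, value in int_args.items():
        # bool is a subclass of int in Python; this sketch chooses to reject it.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int, got {type(value).__name__}")

    bool_args = {"has_bias": has_bias, "batch_first": batch_first,
                 "bidirectional": bidirectional}
    for name, value in bool_args.items():
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a bool, got {type(value).__name__}")

    if isinstance(dropout, bool) or not isinstance(dropout, (int, float)):
        raise TypeError(f"dropout must be a float or an int, got {type(dropout).__name__}")
    if not 0.0 <= dropout < 1.0:
        raise ValueError(f"dropout must be in [0.0, 1.0), got {dropout}")

for bad in ({"hidden_size": 16.0}, {"dropout": 1.0}):
    try:
        check_gru_args(**{"input_size": 10, "hidden_size": 16, **bad})
    except (TypeError, ValueError) as err:
        print(type(err).__name__, "-", err)
```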

Supported Platforms:

Ascend GPU CPU

Examples:

>>> import mindspore as ms
>>> import numpy as np
>>> net = ms.nn.GRU(10, 16, 2, has_bias=True, batch_first=True, bidirectional=False)
>>> x = ms.Tensor(np.ones([3, 5, 10]).astype(np.float32))
>>> h0 = ms.Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
>>> output, hn = net(x, h0)
>>> print(output.shape)
(3, 5, 16)