mindspore.nn.GRUCell

View Source On Gitee
class mindspore.nn.GRUCell(input_size: int, hidden_size: int, has_bias: bool = True, dtype=mstype.float32)[source]

A GRU(Gated Recurrent Unit) cell.

\[\begin{split}\begin{array}{ll} r = \sigma(W_{ir} x + b_{ir} + W_{hr} h + b_{hr}) \\ z = \sigma(W_{iz} x + b_{iz} + W_{hz} h + b_{hz}) \\ n = \tanh(W_{in} x + b_{in} + r * (W_{hn} h + b_{hn})) \\ h' = (1 - z) * n + z * h \end{array}\end{split}\]

Here \(\sigma\) is the sigmoid function, and \(*\) is the Hadamard product. \(W, b\) are learnable weights between the output and the input in the formula. \(h\) is hidden state. \(r\) is reset gate. \(z\) is update gate. \(n\) is n-th layer. For instance, \(W_{ir}, b_{ir}\) are the weight and bias used to transform from input \(x\) to \(r\). Details can be found in paper Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation.

Parameters
  • input_size (int) – Number of features of input.

  • hidden_size (int) – Number of features of hidden layer.

  • has_bias (bool) – Whether the cell has bias \(b_{in}\) and \(b_{hn}\). Default: True .

  • dtype (mindspore.dtype) – Dtype of Parameters. Default: mstype.float32 .

Inputs:
  • x (Tensor) - Tensor of shape \((batch\_size, input\_size)\) .

  • hx (Tensor) - Tensor of data type mindspore.float32 and shape \((batch\_size, hidden\_size)\) .

Outputs:
  • hx' (Tensor) - Tensor of shape \((batch\_size, hidden\_size)\) .

Raises
  • TypeError – If input_size, hidden_size is not an int.

  • TypeError – If has_bias is not a bool.

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import numpy as np
>>> net = ms.nn.GRUCell(10, 16)
>>> x = ms.Tensor(np.ones([5, 3, 10]).astype(np.float32))
>>> hx = ms.Tensor(np.ones([3, 16]).astype(np.float32))
>>> output = []
>>> for i in range(5):
...     hx = net(x[i], hx)
...     output.append(hx)
>>> print(output[0].shape)
(3, 16)