mindspore.nn.GRU

class mindspore.nn.GRU(*args, **kwargs)[source]

Stacked GRU (Gated Recurrent Unit) layers.

Apply GRU layer to the input.

A GRU has two gates: an update gate and a reset gate. Denote two consecutive time steps as $t-1$ and $t$. Given an input $x_t$ at time $t$ and the previous hidden state $h_{t-1}$, the update and reset gates at time $t$ are computed using a gating mechanism. The update gate $z_t$ is designed to protect the cell from perturbation by irrelevant inputs and past hidden state. The reset gate $r_t$ determines how much information should be reset from the old hidden state. The new memory state $n_t$ is calculated from the current input and the previous hidden state, with the reset gate applied to the hidden-state term. Finally, the current hidden state $h_t$ is computed from the update gate and the new memory state. The complete formulation is as follows.

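$$
\begin{aligned}
r_t &= \sigma(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\
z_t &= \sigma(W_{iz} x_t + b_{iz} + W_{hz} h_{(t-1)} + b_{hz}) \\
n_t &= \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)} + b_{hn})) \\
h_t &= (1 - z_t) * n_t + z_t * h_{(t-1)}
\end{aligned}
$$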

Here $\sigma$ is the sigmoid function, and $*$ is the Hadamard product. $W$ and $b$ are the learnable weights and biases between the output and the input in the formula. For instance, $W_{ir}$ and $b_{ir}$ are the weight and bias used to transform the input $x$ into $r$. Details can be found in the paper Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation.
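To make the four equations concrete, the following is a minimal NumPy sketch of a single time step for one layer and one direction. It is not MindSpore's implementation. The function name `gru_step`, the parameter names, and the assumption that the three gate weights are stacked along the first axis in the order r, z, n are all illustrative choices.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, w_ih, w_hh, b_ih, b_hh):
    """One GRU step (illustrative sketch, not the library code).

    x_t:    (batch_size, input_size)
    h_prev: (batch_size, hidden_size)
    w_ih:   (3 * hidden_size, input_size), gates stacked as r, z, n (assumed layout)
    w_hh:   (3 * hidden_size, hidden_size), same stacking
    b_ih, b_hh: (3 * hidden_size,)
    """
    gi = x_t @ w_ih.T + b_ih        # input contributions for r, z, n
    gh = h_prev @ w_hh.T + b_hh     # hidden contributions for r, z, n
    i_r, i_z, i_n = np.split(gi, 3, axis=1)
    h_r, h_z, h_n = np.split(gh, 3, axis=1)
    r_t = sigmoid(i_r + h_r)                    # reset gate
    z_t = sigmoid(i_z + h_z)                    # update gate
    n_t = np.tanh(i_n + r_t * h_n)              # new memory state
    return (1.0 - z_t) * n_t + z_t * h_prev     # current hidden state

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3))     # batch_size=2, input_size=3
h = np.zeros((2, 4))                # hidden_size=4
h = gru_step(x, h,
             rng.standard_normal((12, 3)), rng.standard_normal((12, 4)),
             np.zeros(12), np.zeros(12))
print(h.shape)  # (2, 4)
```

With all weights and biases set to zero, both gates evaluate to 0.5 and the new memory state to 0, so the step returns half of the previous hidden state. This is a quick sanity check on the last equation.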

Note

When using GRU on Ascend, hidden_size must be a multiple of 16.

Parameters
  • input_size (int) – Number of features of input.

  • hidden_size (int) – Number of features of hidden layer.

  • num_layers (int) – Number of layers of stacked GRU. Default: 1.

  • has_bias (bool) – Whether the cell has bias b_ih and b_hh. Default: True.

  • batch_first (bool) – Specifies whether the first dimension of input x is batch_size. Default: False.

  • dropout (float) – If not 0.0, append a Dropout layer on the outputs of each GRU layer except the last layer. Default: 0.0. The range of dropout is [0.0, 1.0).

  • bidirectional (bool) – Specifies whether the GRU is bidirectional. num_directions is 2 if bidirectional=True, otherwise 1. Default: False.

Inputs:
  • x (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape (seq_len, batch_size, input_size), or (batch_size, seq_len, input_size) if batch_first is True.

  • hx (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape (num_directions * num_layers, batch_size, hidden_size). The data type of hx must be the same as x.

  • seq_length (Tensor) - The length of each sequence in an input batch. Tensor of shape (batch_size). Default: None. This input gives the real sequence length before padding, so that padded elements are not used to compute the hidden state and do not affect the final output. It is recommended to use this input when x has padding elements.
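As an illustration of where seq_length values might come from, the sketch below counts the non-padding positions in a batch of padded token ids. The helper name `lengths_from_padding`, the pad id of 0, the assumption that padding only appears at the end of each sequence, and the int32 cast are all assumptions of this example and are not taken from the MindSpore API.

```python
import numpy as np

def lengths_from_padding(token_ids, pad_id=0):
    """Return the unpadded length of each row.

    token_ids: (batch_size, seq_len) integer array, padded at the end with pad_id.
    Hypothetical helper: assumes pad_id never occurs inside a real sequence.
    """
    # int32 is an assumed dtype; check what your MindSpore version expects.
    return (token_ids != pad_id).sum(axis=1).astype(np.int32)

batch = np.array([[5, 3, 0, 0],
                  [7, 2, 9, 1]])
lengths = lengths_from_padding(batch)
print(lengths.tolist())  # [2, 4]
```

The resulting array has shape (batch_size) and could be wrapped in a Tensor and passed as seq_length.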

Outputs:

Tuple, a tuple containing (output, hx_n).

  • output (Tensor) - Tensor of shape (seq_len, batch_size, num_directions * hidden_size), or (batch_size, seq_len, num_directions * hidden_size) if batch_first is True.

  • hx_n (Tensor) - Tensor of shape (num_directions * num_layers, batch_size, hidden_size).
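The shape rules above can be summarized in a small pure-Python helper. `gru_output_shapes` is a hypothetical function written for this illustration and is not part of MindSpore. It only restates the documented shapes.

```python
def gru_output_shapes(seq_len, batch_size, hidden_size,
                      num_layers=1, batch_first=False, bidirectional=False):
    """Return (output_shape, hx_n_shape) as documented for nn.GRU.

    Hypothetical helper for illustration only.
    """
    num_directions = 2 if bidirectional else 1
    if batch_first:
        output = (batch_size, seq_len, num_directions * hidden_size)
    else:
        output = (seq_len, batch_size, num_directions * hidden_size)
    # The hidden state layout does not depend on batch_first.
    hx_n = (num_directions * num_layers, batch_size, hidden_size)
    return output, hx_n

# Two stacked unidirectional layers, batch_first=True, batch of 3 sequences of length 5.
print(gru_output_shapes(5, 3, 16, num_layers=2, batch_first=True))
# ((3, 5, 16), (2, 3, 16))
```

Switching to bidirectional=True doubles both the last dimension of output and the first dimension of hx_n.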

Raises
  • TypeError – If input_size, hidden_size or num_layers is not an int.

  • TypeError – If has_bias, batch_first or bidirectional is not a bool.

  • TypeError – If dropout is not a float.

  • ValueError – If dropout is not in range [0.0, 1.0).
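The checks listed above can be mirrored in user code, for example to validate a configuration before building the network. The sketch below is a hypothetical pre-check, not MindSpore's own validator. The function name `check_gru_args` and the choice to reject bool values for the int parameters are assumptions of this example.

```python
def check_gru_args(input_size, hidden_size, num_layers=1, has_bias=True,
                   batch_first=False, dropout=0.0, bidirectional=False):
    """Raise the same exception types as documented for nn.GRU (illustrative only)."""
    for name, value in (("input_size", input_size),
                        ("hidden_size", hidden_size),
                        ("num_layers", num_layers)):
        # bool is a subclass of int in Python; rejecting it is this sketch's choice.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an int, got {type(value).__name__}")
    for name, value in (("has_bias", has_bias),
                        ("batch_first", batch_first),
                        ("bidirectional", bidirectional)):
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a bool, got {type(value).__name__}")
    if not isinstance(dropout, float):
        raise TypeError(f"dropout must be a float, got {type(dropout).__name__}")
    if not 0.0 <= dropout < 1.0:
        raise ValueError(f"dropout must be in [0.0, 1.0), got {dropout}")

check_gru_args(10, 16, 2, dropout=0.5)   # passes silently
```

A call such as `check_gru_args(10, 16, dropout=1.0)` raises ValueError, because 1.0 lies outside the half-open range.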

Supported Platforms:

Ascend GPU CPU

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> net = nn.GRU(10, 16, 2, has_bias=True, batch_first=True, bidirectional=False)
>>> x = Tensor(np.ones([3, 5, 10]).astype(np.float32))
>>> h0 = Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
>>> output, hn = net(x, h0)
>>> print(output.shape)
(3, 5, 16)