Document feedback

Question document fragment

When a question document fragment contains a formula, it is displayed as a space.

Submission type
issue

It's a little complicated...

I'd like to ask someone.

Please select the submission type

Problem type
Specifications and Common Mistakes

- Specifications and Common Mistakes:

- Misspellings or punctuation mistakes,incorrect formulas, abnormal display.

- Incorrect links, empty cells, or wrong formats.

- Chinese characters in English context.

- Minor inconsistencies between the UI and descriptions.

- Low writing fluency that does not affect understanding.

- Incorrect version numbers, including software package names and version numbers on the UI.

Usability

- Usability:

- Incorrect or missing key steps.

- Missing main function descriptions, keyword explanation, necessary prerequisites, or precautions.

- Ambiguous descriptions, unclear reference, or contradictory context.

- Unclear logic, such as missing classifications, items, and steps.

Correctness

- Correctness:

- Technical principles, function descriptions, supported platforms, parameter types, or exceptions inconsistent with that of software implementation.

- Incorrect schematic or architecture diagrams.

- Incorrect commands or command parameters.

- Incorrect code.

- Commands inconsistent with the functions.

- Wrong screenshots.

- Sample code running error, or running results inconsistent with the expectation.

Risk Warnings

- Risk Warnings:

- Lack of risk warnings for operations that may damage the system or important data.

Content Compliance

- Content Compliance:

- Contents that may violate applicable laws and regulations or geo-cultural context-sensitive words and expressions.

- Copyright infringement.

Please select the type of question

Problem description

Describe the bug so that we can quickly locate the problem.

mindspore.nn.GRU

class mindspore.nn.GRU(*args, **kwargs)[source]

Stacked GRU (Gated Recurrent Unit) layers.

Apply GRU layer to the input.

There are two gates in a GRU model. One is update gate and the other is reset gate. Denote two consecutive time nodes as t1 and t. Given an input xt at time t, a hidden state ht1, the update and reset gate at time t is computed using a gating mechanism. Update gate zt is designed to protect the cell from perturbation by irrelevant inputs and past hidden state. Reset gate rt determines how much information should be reset from old hidden state. New memory state nt is calculated with the current input, on which the reset gate will be applied. Finally, current hidden state ht is computed with the calculated update grate and new memory state. The complete formulation is as follows:

rt=σ(Wirxt+bir+Whrh(t1)+bhr)zt=σ(Wizxt+biz+Whzh(t1)+bhz)nt=tanh(Winxt+bin+rt(Whnh(t1)+bhn))ht=(1zt)nt+zth(t1)

Here σ is the sigmoid function, and is the Hadamard product. W,b are learnable weights between the output and the input in the formula. For instance, Wir,bir are the weight and bias used to transform from input x to r. Details can be found in paper Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation.

Note

When using GRU on Ascend, the hidden size only supports multiples of 16.

Parameters
  • input_size (int) – Number of features of input.

  • hidden_size (int) – Number of features of hidden layer.

  • num_layers (int) – Number of layers of stacked GRU. Default: 1 .

  • has_bias (bool) – Whether the cell has bias bin and bhn. Default: True .

  • batch_first (bool) – Specifies whether the first dimension of input x is batch_size. Default: False .

  • dropout (float) – If not 0.0, append Dropout layer on the outputs of each GRU layer except the last layer. Default 0.0 . The range of dropout is [0.0, 1.0).

  • bidirectional (bool) – Specifies whether it is a bidirectional GRU, num_directions=2 if bidirectional=True otherwise 1. Default: False .

  • dtype (mindspore.dtype) – Dtype of Parameters. Default: mstype.float32 .

Inputs:
  • x (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape (seq_len,batch_size,input_size) or (batch_size,seq_len,input_size).

  • hx (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape (num_directionsnum_layers,batch_size,hidden_size).

  • seq_length (Tensor) - The length of each sequence in an input batch. Tensor of shape (batch_size). Default: None . This input indicates the real sequence length before padding to avoid padded elements have been used to compute hidden state and affect the final output. It is recommended to use this input when x has padding elements.

Outputs:

Tuple, a tuple contains (output, h_n).

  • output (Tensor) - Tensor of shape (seq_len,batch_size,num_directionshidden_size) or (batch_size,seq_len,num_directionshidden_size).

  • hx_n (Tensor) - Tensor of shape (num_directionsnum_layers,batch_size,hidden_size).

Raises
  • TypeError – If input_size, hidden_size or num_layers is not an int.

  • TypeError – If has_bias, batch_first or bidirectional is not a bool.

  • TypeError – If dropout is not a float.

  • ValueError – If dropout is not in range [0.0, 1.0).

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import numpy as np
>>> net = ms.nn.GRU(10, 16, 2, has_bias=True, batch_first=True, bidirectional=False)
>>> x = ms.Tensor(np.ones([3, 5, 10]).astype(np.float32))
>>> h0 = ms.Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
>>> output, hn = net(x, h0)
>>> print(output.shape)
(3, 5, 16)