mindspore.nn.RNN

class mindspore.nn.RNN(*args, **kwargs)

Stacked Elman RNN layers, which apply an RNN layer with \(\tanh\) or \(\text{ReLU}\) non-linearity to the input sequence.

For each element in the input sequence, each layer computes the following function:

\[h_t = activation(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh})\]

Here \(h_t\) is the hidden state at time \(t\), \(x_t\) is the input at time \(t\), and \(h_{(t-1)}\) is the hidden state of the same layer at time \(t-1\), or the initial hidden state at time 0. \(W_{ih}\) is the learnable input-hidden weight matrix, and \(b_{ih}\) is the learnable input-hidden bias. \(W_{hh}\) is the learnable hidden-hidden weight matrix, and \(b_{hh}\) is the learnable hidden-hidden bias.
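The recurrence above can be sketched for a single unidirectional layer in plain NumPy. This is only an illustration of the formula, not MindSpore's implementation: the function name `elman_rnn_forward`, the weight layout (`w_ih` of shape `(hidden_size, input_size)`, `w_hh` of shape `(hidden_size, hidden_size)`) and the random initialization are assumptions made for the example.

```python
import numpy as np

def elman_rnn_forward(x, h0, w_ih, w_hh, b_ih, b_hh, nonlinearity="tanh"):
    """Apply h_t = activation(W_ih x_t + b_ih + W_hh h_(t-1) + b_hh) over a sequence.

    x:  (seq_len, batch_size, input_size)
    h0: (batch_size, hidden_size)
    """
    if nonlinearity == "tanh":
        act = np.tanh
    elif nonlinearity == "relu":
        act = lambda v: np.maximum(v, 0.0)
    else:
        raise ValueError("nonlinearity must be 'tanh' or 'relu'")

    h = h0
    outputs = []
    for x_t in x:  # iterate over time steps
        h = act(x_t @ w_ih.T + b_ih + h @ w_hh.T + b_hh)
        outputs.append(h)
    # Stacked hidden states for every step, and the final hidden state.
    return np.stack(outputs), h

# Hypothetical sizes chosen for illustration only.
rng = np.random.default_rng(0)
seq_len, batch_size, input_size, hidden_size = 5, 3, 10, 16
x = rng.standard_normal((seq_len, batch_size, input_size)).astype(np.float32)
h0 = np.zeros((batch_size, hidden_size), dtype=np.float32)
w_ih = rng.standard_normal((hidden_size, input_size)).astype(np.float32) * 0.1
w_hh = rng.standard_normal((hidden_size, hidden_size)).astype(np.float32) * 0.1
b_ih = np.zeros(hidden_size, dtype=np.float32)
b_hh = np.zeros(hidden_size, dtype=np.float32)

output, h_n = elman_rnn_forward(x, h0, w_ih, w_hh, b_ih, b_hh)
print(output.shape)  # (5, 3, 16)
print(h_n.shape)     # (3, 16)
```

The last slice of `output` equals `h_n`, which mirrors how the final hidden state relates to the per-step outputs in the unidirectional case.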

Parameters
  • input_size (int) – Number of features of input.

  • hidden_size (int) – Number of features of hidden layer.

  • num_layers (int) – Number of layers of stacked RNN. Default: 1 .

  • nonlinearity (str) – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'.

  • has_bias (bool) – Whether the cell has bias \(b_{ih}\) and \(b_{hh}\). Default: True .

  • batch_first (bool) – Specifies whether the first dimension of input x is batch_size. Default: False .

  • dropout (float) – If not 0.0, append a Dropout layer on the outputs of each RNN layer except the last layer. Default: 0.0 . The range of dropout is [0.0, 1.0).

  • bidirectional (bool) – Specifies whether it is a bidirectional RNN. num_directions=2 if bidirectional=True, otherwise 1. Default: False .

  • dtype (mindspore.dtype) – Dtype of Parameters. Default: mstype.float32 .

Inputs:
  • x (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape \((seq\_len, batch\_size, input\_size)\), or \((batch\_size, seq\_len, input\_size)\) if batch_first is True.

  • hx (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape \((num\_directions * num\_layers, batch\_size, hidden\_size)\) .

  • seq_length (Tensor) - The length of each sequence in an input batch. Tensor of shape \((batch\_size)\) . Default: None . This input gives the real length of each sequence before padding, so that padded elements are not used to compute the hidden state and do not affect the final output. It is recommended to use this input when x contains padding elements.
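As a sketch of how a padded batch and its matching seq_length values might be prepared, the NumPy snippet below pads variable-length sequences to a common length and records the real lengths. The helper name `pad_batch`, the batch-first layout, the zero padding value and the int32 dtype for the lengths are assumptions made for illustration, not requirements stated on this page.

```python
import numpy as np

def pad_batch(sequences, pad_value=0.0):
    """Pad a list of (length_i, input_size) arrays to (batch_size, max_len, input_size).

    Returns the padded batch and the real length of each sequence.
    """
    lengths = np.array([len(s) for s in sequences], dtype=np.int32)
    max_len = int(lengths.max())
    input_size = sequences[0].shape[1]
    x = np.full((len(sequences), max_len, input_size), pad_value, dtype=np.float32)
    for i, s in enumerate(sequences):
        x[i, :len(s)] = s  # copy real steps; the rest stays as padding
    return x, lengths

# Three hypothetical sequences of lengths 5, 2 and 4 with 10 features each.
sequences = [
    np.ones((5, 10), dtype=np.float32),
    np.ones((2, 10), dtype=np.float32),
    np.ones((4, 10), dtype=np.float32),
]
x_padded, seq_length = pad_batch(sequences)
print(x_padded.shape)  # (3, 5, 10)
print(seq_length)      # [5 2 4]
```

The two arrays could then be wrapped in Tensors and passed as x and seq_length to a layer created with batch_first=True.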

Outputs:

Tuple, a tuple containing (output, hx_n).

  • output (Tensor) - Tensor of shape \((seq\_len, batch\_size, num\_directions * hidden\_size)\), or \((batch\_size, seq\_len, num\_directions * hidden\_size)\) if batch_first is True.

  • hx_n (Tensor) - Tensor of shape \((num\_directions * num\_layers, batch\_size, hidden\_size)\) .
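The shape rules listed above can be written out as a small pure-Python helper, which may be handy for checking expectations before building a network. The function name `rnn_output_shapes` is hypothetical; it only restates the documented shapes and does not call MindSpore.

```python
def rnn_output_shapes(seq_len, batch_size, hidden_size,
                      num_layers=1, bidirectional=False, batch_first=False):
    """Return the documented shapes of (output, hx_n) for the given settings."""
    num_directions = 2 if bidirectional else 1
    feature_size = num_directions * hidden_size
    if batch_first:
        output_shape = (batch_size, seq_len, feature_size)
    else:
        output_shape = (seq_len, batch_size, feature_size)
    # hx_n has a single documented layout, independent of batch_first.
    hx_n_shape = (num_directions * num_layers, batch_size, hidden_size)
    return output_shape, hx_n_shape

# Same settings as the example on this page: 2 layers, batch_first=True.
print(rnn_output_shapes(5, 3, 16, num_layers=2, batch_first=True))
# ((3, 5, 16), (2, 3, 16))

# A bidirectional variant doubles the feature size and the leading hx_n dimension.
print(rnn_output_shapes(5, 3, 16, num_layers=2, bidirectional=True))
# ((5, 3, 32), (4, 3, 16))
```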

Raises
  • TypeError – If input_size, hidden_size or num_layers is not an int.

  • TypeError – If has_bias, batch_first or bidirectional is not a bool.

  • TypeError – If dropout is not a float.

  • ValueError – If dropout is not in range [0.0, 1.0).

  • ValueError – If nonlinearity is not in ['tanh', 'relu'].

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import numpy as np
>>> net = ms.nn.RNN(10, 16, 2, has_bias=True, batch_first=True, bidirectional=False)
>>> x = ms.Tensor(np.ones([3, 5, 10]).astype(np.float32))
>>> h0 = ms.Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
>>> output, hn = net(x, h0)
>>> print(output.shape)
(3, 5, 16)