mindspore.nn.RNN

class mindspore.nn.RNN(*args, **kwargs)

Stacked Elman RNN layers.

Apply RNN layer with \(\tanh\) or \(\text{ReLU}\) non-linearity to the input.

For each element in the input sequence, each layer computes the following function:

\[h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh})\]

Here \(h_t\) is the hidden state at time t, \(x_t\) is the input at time t, and \(h_{(t-1)}\) is the hidden state of the same layer at time t-1, or the initial hidden state at time 0. For layers above the first, \(x_t\) is the output of the previous layer at time t. If nonlinearity is 'relu', then \(\text{ReLU}\) is used instead of \(\tanh\).
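
The recurrence above can be sketched for a single layer and a single sequence in plain Python. This illustrates the formula only and is not MindSpore's implementation; the helper names matvec, elman_step and run_layer, and the toy weights, are assumptions made for the example.

```python
import math

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def elman_step(x_t, h_prev, W_ih, W_hh, b_ih, b_hh, nonlinearity="tanh"):
    """One time step: h_t = act(W_ih x_t + b_ih + W_hh h_prev + b_hh)."""
    if nonlinearity == "tanh":
        act = math.tanh
    elif nonlinearity == "relu":
        act = lambda z: max(z, 0.0)
    else:
        raise ValueError("nonlinearity must be 'tanh' or 'relu'")
    ih = matvec(W_ih, x_t)
    hh = matvec(W_hh, h_prev)
    return [act(a + b + c + d) for a, b, c, d in zip(ih, b_ih, hh, b_hh)]

def run_layer(xs, h0, W_ih, W_hh, b_ih, b_hh, nonlinearity="tanh"):
    """Apply elman_step over a sequence; return every hidden state and the last one."""
    h, outputs = h0, []
    for x_t in xs:
        h = elman_step(x_t, h, W_ih, W_hh, b_ih, b_hh, nonlinearity)
        outputs.append(h)
    return outputs, h

# Toy sizes (assumed for illustration): input_size=2, hidden_size=2, seq_len=3.
W_ih = [[0.5, -0.5], [0.25, 0.25]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b_ih = [0.0, 0.0]
b_hh = [0.0, 0.0]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h0 = [0.0, 0.0]

outputs, h_n = run_layer(xs, h0, W_ih, W_hh, b_ih, b_hh)
```

In a stacked RNN, the outputs of one layer would serve as the xs of the next layer, each layer having its own weights and initial state.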

Parameters
  • input_size (int) – Number of features of input.

  • hidden_size (int) – Number of features of hidden layer.

  • num_layers (int) – Number of layers of stacked RNN. Default: 1.

  • nonlinearity (str) – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'.

  • has_bias (bool) – Whether the cell has bias b_ih and b_hh. Default: True.

  • batch_first (bool) – Specifies whether the first dimension of input x is batch_size. Default: False.

  • dropout (float) – If not 0.0, a Dropout layer is appended to the outputs of each RNN layer except the last layer. Default: 0.0. The range of dropout is [0.0, 1.0).

  • bidirectional (bool) – Specifies whether it is a bidirectional RNN. num_directions=2 if bidirectional=True, otherwise 1. Default: False.

Inputs:
  • x (Tensor) - Tensor of data type mindspore.float32 and shape (seq_len, batch_size, input_size) or (batch_size, seq_len, input_size).

  • hx (Tensor) - Tensor of data type mindspore.float32 and shape (num_directions * num_layers, batch_size, hidden_size). Data type of hx must be the same as x.

  • seq_length (Tensor) - The length of each sequence in an input batch. Tensor of shape \((\text{batch_size})\). Default: None. This input indicates the real sequence length before padding, so that padded elements are not used to compute the hidden state and do not affect the final output. It is recommended to use this input when x has padding elements.
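
As a minimal sketch of where seq_length comes from, the snippet below pads variable-length sequences into one batch and records each real length. The helper pad_batch and the pad value 0.0 are assumptions for illustration and are not part of MindSpore; the padded batch and the length list would still need to be converted to Tensors before being passed as x and seq_length.

```python
def pad_batch(sequences, pad_value=0.0):
    """Pad variable-length sequences of feature vectors to a common length.

    Returns (padded, lengths). padded is a nested list laid out as
    (batch_size, max_len, input_size), i.e. the batch_first layout, and
    lengths[i] is the real length of sequence i before padding.
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    input_size = len(sequences[0][0])
    pad_step = [pad_value] * input_size
    padded = [
        list(seq) + [list(pad_step) for _ in range(max_len - len(seq))]
        for seq in sequences
    ]
    return padded, lengths

# Two sequences with input_size=2 and real lengths 3 and 1.
batch = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0]],
]
padded, seq_length = pad_batch(batch)
```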

Outputs:

Tuple, a tuple containing (output, hx_n).

  • output (Tensor) - Tensor of shape (seq_len, batch_size, num_directions * hidden_size) or (batch_size, seq_len, num_directions * hidden_size).

  • hx_n (Tensor) - Tensor of shape (num_directions * num_layers, batch_size, hidden_size).
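
The shape rules above can be written out as a small helper, which may be handy for checking expected shapes before building a network. The function rnn_output_shapes is a hypothetical name introduced here for illustration; it only restates the documented shapes and does not call MindSpore.

```python
def rnn_output_shapes(seq_len, batch_size, hidden_size,
                      num_layers=1, batch_first=False, bidirectional=False):
    """Return (output_shape, hx_n_shape) following the documented shape rules."""
    num_directions = 2 if bidirectional else 1
    if batch_first:
        output_shape = (batch_size, seq_len, num_directions * hidden_size)
    else:
        output_shape = (seq_len, batch_size, num_directions * hidden_size)
    # The documented hx_n shape does not depend on batch_first.
    hx_n_shape = (num_directions * num_layers, batch_size, hidden_size)
    return output_shape, hx_n_shape

# Batch of 3 sequences of length 5, hidden_size=16, 2 layers, batch_first=True.
shapes = rnn_output_shapes(seq_len=5, batch_size=3, hidden_size=16,
                           num_layers=2, batch_first=True)
print(shapes)  # ((3, 5, 16), (2, 3, 16))
```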

Raises
  • TypeError – If input_size, hidden_size or num_layers is not an int.

  • TypeError – If has_bias, batch_first or bidirectional is not a bool.

  • TypeError – If dropout is neither a float nor an int.

  • ValueError – If dropout is not in range [0.0, 1.0).

  • ValueError – If nonlinearity is not in ['tanh', 'relu'].

Supported Platforms:

Ascend GPU

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> net = nn.RNN(10, 16, 2, has_bias=True, batch_first=True, bidirectional=False)
>>> x = Tensor(np.ones([3, 5, 10]).astype(np.float32))
>>> h0 = Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
>>> output, hn = net(x, h0)
>>> print(output.shape)
(3, 5, 16)