Differences with torch.nn.RNN

torch.nn.RNN

class torch.nn.RNN(*args, **kwargs)(input, h_0)

For more information, see torch.nn.RNN.

mindspore.nn.RNN

class mindspore.nn.RNN(*args, **kwargs)(x, h_x, seq_length)

For more information, see mindspore.nn.RNN.

Differences

PyTorch: Recurrent Neural Network (RNN) layer.

MindSpore: Implement the same functions as PyTorch.

Categories	Subcategories	PyTorch	MindSpore	Difference
Parameters	Parameter 1	input_size	input_size	-
	Parameter 2	hidden_size	hidden_size	-
	Parameter 3	num_layers	num_layers	-
	Parameter 4	nonlinearity	nonlinearity	-
	Parameter 5	bias	has_bias	Same function, different parameter names
	Parameter 6	batch_first	batch_first	-
	Parameter 7	dropout	dropout	-
	Parameter 8	bidirectional	bidirectional	-
Inputs	Input 1	input	x	Same function, different parameter names
	Input 2	h_0	hx	Same function, different parameter names
	Input 3	-	seq_length	This input specifies the true sequence length to avoid using the filled elements to calculate the hidden state, which affects the final output. It is recommended to use this input when x is populated with elements. Default value: None.

Code Example 1# PyTorch
import torch
from torch import tensor
from torch import nn
import numpy as np

rnn = torch.nn.RNN(2, 3, 4, nonlinearity="relu", bias=False)
x = torch.tensor(np.array([[[3.0, 4.0]]]).astype(np.float32))
h_0 = torch.tensor(np.array([[[1.0, 2.0, 3]], [[3.0, 4.0, 5]], [[3.0, 4.0, 5]], [[3.0, 4.0, 5]]]).astype(np.float32))
output, hx_n = rnn(x, h_0)
print(output)
# tensor([[[0.0000, 0.4771, 0.8548]]], grad_fn=<StackBackward0>)
print(hx_n)
# tensor([[[0.0000, 0.5015, 0.0000]],
#        [[2.3183, 0.0000, 1.7400]],
#        [[2.0082, 0.0000, 1.4658]],
#        [[0.0000, 0.4771, 0.8548]]], grad_fn=<StackBackward0>)

# MindSpore
import mindspore
from mindspore import Tensor
import mindspore.nn as nn
import numpy as np

rnn = nn.RNN(2, 3, 4, nonlinearity="relu", has_bias=False)
x = Tensor(np.array([[[3.0, 4.0]]]).astype(np.float32))
h_0 = Tensor(np.array([[[1.0, 2.0, 3]], [[3.0, 4.0, 5]], [[3.0, 4.0, 5]], [[3.0, 4.0, 5]]]).astype(np.float32))
output, hx_n = rnn(x, h_0)
print(output)
# [[[2.2204838 0.        2.365325 ]]]
print(hx_n)
#[[[1.4659244  0.         1.3142354 ]]
# [[0.         0.16777739 0.        ]]
# [[3.131722   0.         0.        ]]
# [[2.2204838  0.         2.365325  ]]]