Differences with torch.nn.GRU
torch.nn.GRU
class torch.nn.GRU(
input_size,
hidden_size,
num_layers=1,
bias=True,
batch_first=False,
dropout=0,
bidirectional=False)(input, h_0) -> Tensor
For more information, see torch.nn.GRU.
mindspore.nn.GRU
class mindspore.nn.GRU(
input_size,
hidden_size,
num_layers=1,
has_bias=True,
batch_first=False,
dropout=0.0,
bidirectional=False)(x, hx, seq_length) -> Tensor
For more information, see mindspore.nn.GRU.
Differences
PyTorch: Computes the output sequence and the final hidden state from the input sequence and the given initial hidden state.
MindSpore: The functionality is consistent. MindSpore takes one additional input, seq_length, which indicates the valid length of each sequence in the input batch (see the seq_length sketch after the code examples below).
| Categories | Subcategories | PyTorch | MindSpore | Difference |
|---|---|---|---|---|
| Parameters | Parameter 1 | input_size | input_size | - |
| | Parameter 2 | hidden_size | hidden_size | - |
| | Parameter 3 | num_layers | num_layers | - |
| | Parameter 4 | bias | has_bias | Same function, different parameter names |
| | Parameter 5 | batch_first | batch_first | - |
| | Parameter 6 | dropout | dropout | - |
| | Parameter 7 | bidirectional | bidirectional | - |
| Inputs | Input 1 | input | x | Same function, different parameter names |
| | Input 2 | h_0 | hx | Same function, different parameter names |
| | Input 3 | - | seq_length | The length of each sequence in the input batch |
Code Example
The two APIs implement the same functionality and are used in the same way.
# PyTorch
import torch
import torch.nn as nn

rnn = nn.GRU(10, 16, 2, batch_first=True)
input = torch.ones([3, 5, 10], dtype=torch.float32)
# h0 shape: (num_directions * num_layers, batch_size, hidden_size) = (1 * 2, 3, 16)
h0 = torch.ones([1 * 2, 3, 16], dtype=torch.float32)
output, hn = rnn(input, h0)
output = output.detach().numpy()
print(output.shape)
# (3, 5, 16)
# MindSpore
import numpy as np
import mindspore
from mindspore import Tensor, nn

net = nn.GRU(10, 16, 2, batch_first=True)
x = Tensor(np.ones([3, 5, 10]), mindspore.float32)
# h0 shape: (num_directions * num_layers, batch_size, hidden_size) = (1 * 2, 3, 16)
h0 = Tensor(np.ones([1 * 2, 3, 16]), mindspore.float32)
output, hn = net(x, h0)
print(output.shape)
# (3, 5, 16)
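The extra seq_length input has no PyTorch counterpart in this signature. The following is a minimal sketch of how it can be passed, reusing the shapes from the example above; the per-sequence lengths (5, 3, 2) are illustrative values, and seq_length is assumed to be an int Tensor of shape (batch_size,).

# MindSpore with seq_length (a minimal sketch; lengths are illustrative)
import numpy as np
import mindspore
from mindspore import Tensor, nn

net = nn.GRU(10, 16, 2, batch_first=True)
x = Tensor(np.ones([3, 5, 10]), mindspore.float32)
h0 = Tensor(np.ones([1 * 2, 3, 16]), mindspore.float32)
# Valid length of each of the 3 sequences in the batch (illustrative values)
seq_length = Tensor(np.array([5, 3, 2]), mindspore.int32)
output, hn = net(x, h0, seq_length)
print(output.shape)
# (3, 5, 16)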