Differences with torch.nn.LSTM
torch.nn.LSTM
class torch.nn.LSTM(
input_size,
hidden_size,
num_layers=1,
bias=True,
batch_first=False,
dropout=0,
bidirectional=False,
proj_size=0)(input, (h_0, c_0)) -> Tensor
For more information, see torch.nn.LSTM.
mindspore.nn.LSTM
class mindspore.nn.LSTM(
input_size,
hidden_size,
num_layers=1,
has_bias=True,
batch_first=False,
dropout=0,
bidirectional=False)(x, hx, seq_length) -> Tensor
For more information, see mindspore.nn.LSTM.
Differences
PyTorch: Compute the output sequence and final state based on the input sequence and the given initial state.
MindSpore: If the proj_size parameter in PyTorch is not specified, the MindSpore API achieves the same functionality as PyTorch, with only some of the parameter names being different.
Categories |
Subcategories |
PyTorch |
MindSpore |
Difference |
---|---|---|---|---|
Parameters |
Parameter 1 |
input_size |
input_size |
- |
Parameter 2 |
hidden_size |
hidden_size |
- |
|
Parameter 3 |
num_layers |
num_layers |
- |
|
Parameter 4 |
bias |
has_bias |
Same function, different parameter names |
|
Parameter 5 |
batch_first |
batch_first |
- |
|
Parameter 6 |
dropout |
dropout |
- |
|
Parameter 7 |
bidirectional |
bidirectional |
- |
|
Parameter 8 |
proj_size |
- |
In PyTorch, if proj_size>0, the hidden_size in the output shape will become proj_size, and the default value is 0. MindSpore does not have this parameter |
|
Inputs |
Input 1 |
input |
x |
Same function, different parameter names |
Input 2 |
h_0 |
hx |
In MindSpore hx represents a tuple of two Tensor(h_0, c_0), corresponding to inputs 2 and 3 in PyTorch, with the same function |
|
Input 3 |
c_0 |
hx |
In MindSpore hx represents a tuple of two Tensor(h_0, c_0), corresponding to inputs 2 and 3 in PyTorch, with the same function |
|
Input 4 |
- |
seq_length |
This parameter in MindSpore specifies the sequence length of the input batch. PyTorch does not have this parameter |
Code Example
When the parameter proj_size in PyTorch takes the default value of 0, the two APIs achieve the same function and have the same usage.
# PyTorch
import torch
from torch import tensor
import numpy as np
rnn = torch.nn.LSTM(10, 16, 2, bias=True, batch_first=True, bidirectional=False)
input1 = tensor(np.ones([3, 5, 10]), dtype=torch.float32)
h0 = tensor(np.ones([1 * 2, 3, 16]), dtype=torch.float32)
c0 = tensor(np.ones([1 * 2, 3, 16]), dtype=torch.float32)
output, (hn, cn) = rnn(input1, (h0, c0))
print(output.detach().numpy().shape)
# (3, 5, 16)
print(hn.detach().numpy().shape)
# (2, 3, 16)
print(cn.detach().numpy().shape)
# (2, 3, 16)
# MindSpore
import mindspore
from mindspore import Tensor
import numpy as np
net = mindspore.nn.LSTM(10, 16, 2, has_bias=True, batch_first=True, bidirectional=False)
x = Tensor(np.ones([3, 5, 10]).astype(np.float32))
h0 = Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
c0 = Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
output, (hn, cn) = net(x, (h0, c0))
print(output.shape)
# (3, 5, 16)
print(hn.shape)
# (2, 3, 16)
print(cn.shape)
# (2, 3, 16)