mindspore.nn.LSTMCell
- class mindspore.nn.LSTMCell(input_size, hidden_size, has_bias=True, batch_first=False, dropout=0, bidirectional=False)
LSTM (Long Short-Term Memory) layer.
Applies an LSTM layer to the input.
There are two pipelines connecting two consecutive cells in an LSTM model; one is the cell state pipeline and the other is the hidden state pipeline. Denote two consecutive time nodes as $t-1$ and $t$. Given an input $x_t$ at time $t$, a hidden state $h_{t-1}$ and a cell state $c_{t-1}$ of the layer at time $t-1$, the cell state and hidden state at time $t$ are computed using a gating mechanism. Input gate $i_t$ is designed to protect the cell from perturbation by irrelevant inputs. Forget gate $f_t$ affords protection of the cell by forgetting some information in the past, which is stored in $h_{t-1}$. Output gate $o_t$ protects other units from perturbation by currently irrelevant memory contents. Candidate cell state $\tilde{c}_t$ is calculated with the current input, on which the input gate will be applied. Finally, current cell state $c_t$ and hidden state $h_t$ are computed with the calculated gates and cell states. The complete formulation is as follows:

$$
\begin{array}{ll}
i_t = \sigma(W_{ix} x_t + b_{ix} + W_{ih} h_{t-1} + b_{ih}) \\
f_t = \sigma(W_{fx} x_t + b_{fx} + W_{fh} h_{t-1} + b_{fh}) \\
\tilde{c}_t = \tanh(W_{cx} x_t + b_{cx} + W_{ch} h_{t-1} + b_{ch}) \\
o_t = \sigma(W_{ox} x_t + b_{ox} + W_{oh} h_{t-1} + b_{oh}) \\
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t = o_t \odot \tanh(c_t)
\end{array}
$$

Here $\sigma$ is the sigmoid function, and $\odot$ is the Hadamard product. $W, b$ are learnable weights between the output and the input in the formula. For instance, $W_{ix}, b_{ix}$ are the weight and bias used to transform from input $x$ to $i$. Details can be found in the papers LONG SHORT-TERM MEMORY and Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling.
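As a quick sanity check of the formulation above, here is a minimal NumPy sketch of a single LSTM time step. It illustrates the equations only and is not MindSpore's internal kernel; the gate ordering and the fused weight shapes are assumptions made for the example.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, c_prev, W_x, W_h, b):
        # Assumed shapes: W_x (4*hidden, input), W_h (4*hidden, hidden), b (4*hidden,),
        # with the four gate blocks stacked in the (assumed) order i, f, candidate, o.
        gates = W_x @ x_t + W_h @ h_prev + b
        i, f, g, o = np.split(gates, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c_t = f * c_prev + i * np.tanh(g)   # c_t = f_t * c_(t-1) + i_t * ctilde_t
        h_t = o * np.tanh(c_t)              # h_t = o_t * tanh(c_t)
        return h_t, c_t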
Note
LSTMCell is a single-layer RNN; a multi-layer RNN can be built by stacking LSTMCell instances, as sketched below.
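As a rough illustration of the note, two cells can be chained by feeding the first cell's output sequence into the second. The flat weight tensors w1 and w2 are hypothetical placeholders whose sizes follow the weight_size sketch given after the Inputs list below.

    import numpy as np
    import mindspore.nn as nn
    from mindspore import Tensor

    cell1 = nn.LSTMCell(10, 12, has_bias=True, batch_first=True)   # layer 1: 10 -> 12
    cell2 = nn.LSTMCell(12, 12, has_bias=True, batch_first=True)   # layer 2: 12 -> 12

    x = Tensor(np.ones([3, 5, 10]).astype(np.float32))             # (batch, seq, feature)
    h0 = Tensor(np.ones([1, 3, 12]).astype(np.float32))
    c0 = Tensor(np.ones([1, 3, 12]).astype(np.float32))
    w1 = Tensor(np.ones([1152, 1, 1]).astype(np.float32))          # assumed: 4*12*(10+12+2)
    w2 = Tensor(np.ones([1248, 1, 1]).astype(np.float32))          # assumed: 4*12*(12+12+2)

    out1, _, _, _, _ = cell1(x, h0, c0, w1)                        # out1: (3, 5, 12)
    out2, h_n, c_n, _, _ = cell2(out1, h0, c0, w2)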
- Parameters
input_size (int) – Number of features of input.
hidden_size (int) – Number of features of hidden layer.
has_bias (bool) – Whether the cell has bias b_ih and b_hh. Default: True.
batch_first (bool) – Specifies whether the first dimension of input x is batch_size. Default: False.
dropout (float, int) – If not 0, appends a Dropout layer on the outputs of each LSTM layer except the last layer. Default: 0. The range of dropout is [0.0, 1.0].
bidirectional (bool) – Specifies whether this is a bidirectional LSTM. If True, the number of directions is 2; otherwise it is 1 (see the shape sketch below). Default: False.
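To make the direction-dependent shapes concrete, the snippet below sets up a hypothetical bidirectional cell. The h, c, and output shapes follow the Inputs and Outputs sections below; the flat weight size 2304 is an assumption derived from the weight_size sketch after the Inputs list.

    import numpy as np
    import mindspore.nn as nn
    from mindspore import Tensor

    net = nn.LSTMCell(10, 12, has_bias=True, batch_first=False, bidirectional=True)
    x = Tensor(np.ones([5, 3, 10]).astype(np.float32))    # (seq_len, batch_size, input_size)
    h = Tensor(np.ones([2, 3, 12]).astype(np.float32))    # num_directions = 2
    c = Tensor(np.ones([2, 3, 12]).astype(np.float32))
    w = Tensor(np.ones([2304, 1, 1]).astype(np.float32))  # assumed: 2 * 4*12*(10+12+2)
    output, h_n, c_n, _, _ = net(x, h, c, w)
    print(output.shape)                                   # expected: (5, 3, 24)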
- Inputs:
x (Tensor) - Tensor of shape (seq_len, batch_size, input_size).
h (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape (num_directions, batch_size, hidden_size).
c (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape (num_directions, batch_size, hidden_size). The data types of h and c must be the same as that of x.
w (Tensor) - Tensor of data type mindspore.float32 or mindspore.float16 and shape (weight_size, 1, 1). The value of weight_size depends on input_size, hidden_size, has_bias and bidirectional (see the sketch after this list).
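The exact weight_size rule is not spelled out here; the helper below is a hypothetical reconstruction that is consistent with the example at the bottom of this page (input_size=10, hidden_size=12, has_bias=True, unidirectional gives weight_size=1152).

    def lstm_weight_size(input_size, hidden_size, has_bias=True, bidirectional=False):
        # Hypothetical rule inferred from the example below, not an official formula.
        num_directions = 2 if bidirectional else 1
        gate_size = 4 * hidden_size                 # i, f, candidate and o gates
        per_direction = gate_size * (input_size + hidden_size)
        if has_bias:
            per_direction += 2 * gate_size          # b_ih and b_hh
        return num_directions * per_direction

    assert lstm_weight_size(10, 12) == 1152         # matches the example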
- Outputs:
Tuple of 5 Tensors: (output, h_n, c_n, reserve, state).
output (Tensor) - Tensor of shape (seq_len, batch_size, num_directions * hidden_size).
h_n (Tensor) - Tensor of shape (num_directions, batch_size, hidden_size).
c_n (Tensor) - Tensor of shape (num_directions, batch_size, hidden_size).
reserve (Tensor) - Reserved.
state (Tensor) - Reserved.
- Raises
TypeError – If input_size or hidden_size is not an int.
TypeError – If has_bias or batch_first or bidirectional is not a bool.
TypeError – If dropout is neither a float nor an int.
ValueError – If dropout is not in range [0.0, 1.0].
- Supported Platforms:
GPU
CPU
Examples
>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> net = nn.LSTMCell(10, 12, has_bias=True, batch_first=True, bidirectional=False)
>>> x = Tensor(np.ones([3, 5, 10]).astype(np.float32))
>>> h = Tensor(np.ones([1, 3, 12]).astype(np.float32))
>>> c = Tensor(np.ones([1, 3, 12]).astype(np.float32))
>>> w = Tensor(np.ones([1152, 1, 1]).astype(np.float32))
>>> output, h, c, _, _ = net(x, h, c, w)
>>> print(output.shape)
(3, 5, 12)