mindspore.nn.LSTMCell

class mindspore.nn.LSTMCell(input_size: int, hidden_size: int, has_bias: bool = True, dtype=mstype.float32)

An LSTM (Long Short-Term Memory) cell.

$$
\begin{array}{l}
i_{t} = \sigma(W_{ix} x_{t} + b_{ix} + W_{ih} h_{(t-1)} + b_{ih}) \\
f_{t} = \sigma(W_{fx} x_{t} + b_{fx} + W_{fh} h_{(t-1)} + b_{fh}) \\
\tilde{c}_{t} = \tanh(W_{cx} x_{t} + b_{cx} + W_{ch} h_{(t-1)} + b_{ch}) \\
o_{t} = \sigma(W_{ox} x_{t} + b_{ox} + W_{oh} h_{(t-1)} + b_{oh}) \\
c_{t} = f_{t} * c_{(t-1)} + i_{t} * \tilde{c}_{t} \\
h_{t} = o_{t} * \tanh(c_{t})
\end{array}
$$

Here $\sigma$ is the sigmoid function and $*$ is the Hadamard (element-wise) product. $W$ and $b$ are the learnable weights and biases that connect the inputs to the outputs in each formula. For instance, $W_{ix}$ and $b_{ix}$ are the weight and bias used to transform the input $x$ into $i$. Details can be found in the papers "Long Short-Term Memory" and "Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling".
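As a rough illustration of the equations above, the following NumPy sketch computes a single cell step. It is not MindSpore's implementation: the function name `lstm_cell_step`, the per-gate parameter dictionary, and the random initialization are all assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid, written as sigma in the equations."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_cell_step(x, h_prev, c_prev, params):
    """One LSTM step following the equations above (illustrative only).

    x: (batch_size, input_size); h_prev, c_prev: (batch_size, hidden_size).
    params maps a gate name in {"i", "f", "c", "o"} to a hypothetical tuple
    (W_x, b_x, W_h, b_h) with W_x: (hidden_size, input_size) and
    W_h: (hidden_size, hidden_size).
    """
    def pre_activation(gate):
        W_x, b_x, W_h, b_h = params[gate]
        return x @ W_x.T + b_x + h_prev @ W_h.T + b_h

    i = sigmoid(pre_activation("i"))        # input gate
    f = sigmoid(pre_activation("f"))        # forget gate
    c_tilde = np.tanh(pre_activation("c"))  # candidate cell state
    o = sigmoid(pre_activation("o"))        # output gate
    c = f * c_prev + i * c_tilde            # element-wise (Hadamard) products
    h = o * np.tanh(c)
    return h, c


# Hypothetical sizes and small random weights, chosen only for the demo.
rng = np.random.default_rng(0)
batch_size, input_size, hidden_size = 3, 10, 16
params = {
    gate: (
        rng.standard_normal((hidden_size, input_size)) * 0.1,
        np.zeros(hidden_size),
        rng.standard_normal((hidden_size, hidden_size)) * 0.1,
        np.zeros(hidden_size),
    )
    for gate in "ifco"
}
x = np.ones((batch_size, input_size))
h0 = np.zeros((batch_size, hidden_size))
c0 = np.zeros((batch_size, hidden_size))
h1, c1 = lstm_cell_step(x, h0, c0, params)
print(h1.shape, c1.shape)  # (3, 16) (3, 16)
```

Keeping one weight pair per gate mirrors the notation of the equations; a practical implementation would more likely fuse the four gates into a single matrix multiplication.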

The encapsulated LSTMCell can be simplified to the following formula:

$$h', c' = \mathrm{LSTMCell}(x, (h_{0}, c_{0}))$$
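To make this call form concrete, here is a minimal NumPy sketch that treats the cell as one function of `x` and the state tuple, then applies it over a short sequence, feeding each returned state into the next call. The name `lstm_cell`, the packing of the four gates into one weight matrix, the gate order (i, f, c~, o), and the single fused bias are assumptions of this sketch, not a statement about how MindSpore stores its parameters.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_cell(x, hx, w_x, w_h, b):
    """Hypothetical stand-in for LSTMCell(x, (h_0, c_0)) -> (h', c').

    w_x: (4 * hidden_size, input_size), w_h: (4 * hidden_size, hidden_size).
    b: (4 * hidden_size,), standing for the sum of the two bias terms.
    The gate order i, f, c~, o along the first axis is an assumption.
    """
    h_prev, c_prev = hx
    gates = x @ w_x.T + h_prev @ w_h.T + b
    i, f, g, o = np.split(gates, 4, axis=1)
    c_next = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h_next = sigmoid(o) * np.tanh(c_next)
    return h_next, c_next


# Hypothetical sizes and weights, chosen only for the demo.
rng = np.random.default_rng(0)
seq_len, batch_size, input_size, hidden_size = 5, 3, 10, 16
w_x = rng.standard_normal((4 * hidden_size, input_size)) * 0.1
w_h = rng.standard_normal((4 * hidden_size, hidden_size)) * 0.1
b = np.zeros(4 * hidden_size)

xs = rng.standard_normal((seq_len, batch_size, input_size))
state = (np.zeros((batch_size, hidden_size)),
         np.zeros((batch_size, hidden_size)))
hidden_states = []
for x_t in xs:
    # The tuple returned at step t is the state passed in at step t + 1.
    state = lstm_cell(x_t, state, w_x, w_h, b)
    hidden_states.append(state[0])

print(len(hidden_states), hidden_states[-1].shape)  # 5 (3, 16)
```

Because the cell handles only one time step, the caller owns the loop and the state, which is what distinguishes a cell from a full recurrent layer.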

Parameters

input_size (int) – Number of features of input.
hidden_size (int) – Number of features of hidden layer.
has_bias (bool) – Whether the cell has the input-to-hidden bias b_{ih} and the hidden-to-hidden bias b_{hh}. Default: True.
dtype (mindspore.dtype) – Dtype of Parameters. Default: mstype.float32.

Inputs:

x (Tensor) - Tensor of shape $(batch\_size, input\_size)$.
hx (tuple) - A tuple of two Tensors (h_0, c_0), both of data type mindspore.float32 and shape $(batch\_size, hidden\_size)$.

Outputs:

hx' (tuple) - A tuple of two Tensors (h', c'), both of shape $(batch\_size, hidden\_size)$.

Raises

TypeError – If input_size or hidden_size is not an int.
TypeError – If has_bias is not a bool.

Supported Platforms: Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import numpy as np
>>> net = ms.nn.LSTMCell(10, 16)
>>> x = ms.Tensor(np.ones([5, 3, 10]).astype(np.float32))
>>> h = ms.Tensor(np.ones([3, 16]).astype(np.float32))
>>> c = ms.Tensor(np.ones([3, 16]).astype(np.float32))
>>> output = []
>>> for i in range(5):
...     h, c = net(x[i], (h, c))  # feed the updated state into the next step
...     output.append((h, c))
>>> print(output[0][0].shape)
(3, 16)