mindspore.nn.CTCLoss
- class mindspore.nn.CTCLoss(blank=0, reduction='mean', zero_infinity=False)[source]
Calculates the CTC (Connectionist Temporal Classification) loss.
For the CTC algorithm, refer to Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks.
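For reference, the quantity computed is the standard CTC negative log-likelihood from the paper above (the notation below is a restatement, not part of the API). Writing \mathcal{B} for the CTC collapsing map that removes repeated labels and blanks, and \pi for a length-T frame-level path, the loss for one input x with target sequence y is

    p(y \mid x) = \sum_{\pi \in \mathcal{B}^{-1}(y)} \prod_{t=1}^{T} p(\pi_t \mid x, t), \qquad \mathcal{L}(x, y) = -\log p(y \mid x),

where p(\pi_t \mid x, t) is the probability obtained by exponentiating log_probs at time step t. With reduction='none', this per-sample value is returned for each element of the batch.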
- Parameters
blank (int) - The index reserved for the blank label. Must be in range [0, C), where C is the number of classes of log_probs. Default: 0.
reduction (str) - The reduction applied to the output, one of 'none', 'mean' or 'sum'. Default: 'mean'.
zero_infinity (bool) - Whether to set infinite losses and the associated gradients to zero. Default: False.
- Inputs:
log_probs (Tensor) - A tensor of shape (T, N, C) or (T, C), where T is the input sequence length, N is the batch size and C is the number of classes (including blank). T, N and C are positive integers. The values are expected to be log-probabilities, typically the output of a log-softmax over the class axis (see the sketch after Outputs).
targets (Tensor) - A tensor of shape (N, S) or (sum(target_lengths),), where S is the maximum target length. It contains the target sequences.
input_lengths (Union[tuple, Tensor, int]) - A tuple or Tensor of shape (N,), or a single number, giving the length of each input sequence.
target_lengths (Union[tuple, Tensor, int]) - A tuple or Tensor of shape (N,), or a single number, giving the length of each target sequence.
- Outputs:
neg_log_likelihood (Tensor) - A loss value which is differentiable with respect to each input node.
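A minimal preparation sketch for the inputs above (the network scores, sizes and the use of nn.LogSoftmax are illustrative assumptions, not requirements of the API): log_probs is normalized over the class axis, and the length tensors carry one entry per batch element.

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> from mindspore import dtype as mstype
>>> T, N, C, S = 5, 2, 3, 2                                       # illustrative sizes
>>> logits = Tensor(np.random.randn(T, N, C), mstype.float32)     # raw scores of shape (T, N, C)
>>> log_probs = nn.LogSoftmax(axis=-1)(logits)                    # log-probabilities over the C classes
>>> targets = Tensor(np.array([[1, 2], [2, 1]]), mstype.int32)    # (N, S), labels drawn from [1, C) so they avoid blank=0
>>> input_lengths = Tensor(np.array([T, T]), mstype.int32)        # (N,), each entry <= T
>>> target_lengths = Tensor(np.array([S, S]), mstype.int32)       # (N,), each entry <= the matching input length
>>> loss = nn.CTCLoss(blank=0, reduction='mean')(log_probs, targets, input_lengths, target_lengths)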
- Raises
TypeError – If zero_infinity is not a bool or reduction is not a str.
TypeError – If the dtype of log_probs is not float or double.
TypeError – If the dtype of targets, input_lengths or target_lengths is not int32 or int64.
ValueError – If reduction is not “none”, “mean” or “sum”.
ValueError – If the types of targets, input_lengths or target_lengths are different.
ValueError – If the value of blank is not in range [0, C), where C is the number of classes of log_probs.
ValueError – If any value of input_lengths is larger than T, where T is the input sequence length of log_probs.
ValueError – If any target_lengths[i] is not in range [0, input_lengths[i]].
- Supported Platforms:
Ascend
CPU
Examples
>>> import numpy as np
>>> from mindspore import Tensor
>>> from mindspore import dtype as mstype
>>> from mindspore.nn.loss import CTCLoss
>>> T = 5      # Input sequence length
>>> C = 2      # Number of classes
>>> N = 2      # Batch size
>>> S = 3      # Target sequence length of longest target in batch (padding length)
>>> S_min = 2  # Minimum target length, for demonstration purposes
>>> arr = np.arange(T*N*C).reshape((T, N, C))
>>> ms_input = Tensor(arr, dtype=mstype.float32)
>>> input_lengths = np.full(shape=(N), fill_value=T)
>>> input_lengths = Tensor(input_lengths, dtype=mstype.int32)
>>> target_lengths = np.full(shape=(N), fill_value=S_min)
>>> target_lengths = Tensor(target_lengths, dtype=mstype.int32)
>>> target = np.random.randint(1, C, size=(N, S))
>>> target = Tensor(target, dtype=mstype.int32)
>>> ctc_loss = CTCLoss(blank=0, reduction='none', zero_infinity=False)
>>> loss = ctc_loss(ms_input, target, input_lengths, target_lengths)
>>> print(loss)
Tensor(shape=[2], dtype=Float32, value= [-4.57949715e+001, -5.57949677e+001])
>>> arr = np.arange(T*C).reshape((T, C))
>>> ms_input = Tensor(arr, dtype=mstype.float32)
>>> input_lengths = T
>>> target_lengths = S_min
>>> target = np.random.randint(1, C, size=(S_min,))
>>> target = Tensor(target, dtype=mstype.int32)
>>> ctc_loss = CTCLoss(blank=0, reduction='none', zero_infinity=False)
>>> loss = ctc_loss(ms_input, target, input_lengths, target_lengths)
>>> print(loss)
Tensor(shape=[1], dtype=Float32, value= [-2.57949677e+001])
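Two follow-up variations, given as a sketch (the scores and sizes are illustrative and no exact output values are claimed). A repeated label needs an intervening blank, so a length-2 repeated target cannot be aligned within T=2 time steps; under the usual CTC convention such a sample yields an infinite loss, which zero_infinity=True replaces with zero. With reduction='mean' the per-sample losses are reduced to a single scalar.

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> from mindspore import dtype as mstype
>>> T, N, C = 2, 1, 2
>>> log_probs = nn.LogSoftmax(axis=-1)(Tensor(np.random.randn(T, N, C), mstype.float32))
>>> targets = Tensor(np.array([[1, 1]]), mstype.int32)    # repeated label cannot fit into 2 frames
>>> input_lengths = Tensor(np.array([T]), mstype.int32)
>>> target_lengths = Tensor(np.array([2]), mstype.int32)
>>> loss_inf = nn.CTCLoss(blank=0, reduction='none', zero_infinity=False)(log_probs, targets, input_lengths, target_lengths)    # expected to be infinite
>>> loss_zeroed = nn.CTCLoss(blank=0, reduction='none', zero_infinity=True)(log_probs, targets, input_lengths, target_lengths)  # infinity replaced by 0
>>> loss_mean = nn.CTCLoss(blank=0, reduction='mean', zero_infinity=True)(log_probs, targets, input_lengths, target_lengths)    # single scalar over the batch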