mindspore.ops.CTCLossV2
- class mindspore.ops.CTCLossV2(blank=0, reduction='none', zero_infinity=False)
Calculates the CTC (Connectionist Temporal Classification) loss and the gradient.
The CTC algorithm was proposed in "Connectionist Temporal Classification: Labeling Unsegmented Sequence Data with Recurrent Neural Networks".
Warning
This is an experimental API that is subject to change or deletion.
- Parameters
blank (int, optional) – The blank label. Default: 0.
reduction (str, optional) – The reduction method applied to the output. Currently only 'none' is supported. Default: 'none'.
zero_infinity (bool, optional) – If the loss is infinite, this parameter determines whether to set that loss and its correlated gradient to zero. Default: False.
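As a rough illustration of when the loss can become infinite: in the standard CTC formulation, a target has no valid alignment when the input has fewer steps than the target length plus the number of adjacent repeated labels, because each repeat needs a separating blank. The pure-Python sketch below is not MindSpore code. `min_input_length` and `apply_zero_infinity` are hypothetical helper names, and the second only mimics the documented effect of zero_infinity on the loss value, not on the gradient.

```python
import math


def min_input_length(target):
    """Smallest number of input steps that can emit `target` under standard CTC rules."""
    # Every label needs one step; adjacent equal labels need a blank between them.
    repeats = sum(1 for a, b in zip(target, target[1:]) if a == b)
    return len(target) + repeats


def apply_zero_infinity(loss, zero_infinity):
    """Hypothetical helper: replace an infinite loss with 0.0 when zero_infinity is set."""
    return 0.0 if zero_infinity and math.isinf(loss) else loss


# The target [1, 1, 2] needs at least 4 steps: 1, blank, 1, 2.
print(min_input_length([1, 1, 2]))           # 4
print(apply_zero_infinity(math.inf, True))   # 0.0
print(apply_zero_infinity(math.inf, False))  # inf
```

With zero_infinity left at False, such a sample would keep its infinite loss, which can dominate any reduction applied afterwards.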
- Inputs:
log_probs (Tensor) - A tensor of shape \((T, N, C)\), where \(T\) is the input length, \(N\) is the batch size and \(C\) is the number of classes (including blank). Supported dtypes: float32, float64.
targets (Tensor) - The target sequences, a tensor of shape \((N, S)\), where \(S\) is the maximum target length. Supported dtypes: int32, int64.
input_lengths (Union(Tuple, Tensor)) - A tuple or Tensor of shape \((N)\) holding the length of each input sequence. Supported dtypes: int32, int64.
target_lengths (Union(Tuple, Tensor)) - A tuple or Tensor of shape \((N)\) holding the length of each target sequence. Supported dtypes: int32, int64.
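Because targets has a fixed shape \((N, S)\) while target_lengths carries the true length of each sequence, shorter label sequences have to be padded before they are wrapped in a Tensor. The sketch below shows one way to do this in plain Python. `pad_targets` and `pad_value` are hypothetical names, and the sketch assumes that entries beyond target_lengths[i] are ignored by the operator, so the padding value is arbitrary.

```python
def pad_targets(sequences, pad_value=0):
    """Pad variable-length label sequences to a common length S.

    Returns (padded, lengths): `padded` is an N x S nested list and
    `lengths` holds the true length of each sequence.
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths, default=0)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


targets, target_lengths = pad_targets([[1, 2, 2], [3]])
print(targets)         # [[1, 2, 2], [3, 0, 0]]
print(target_lengths)  # [3, 1]
```

The two results can then be converted to int32 or int64 tensors of shapes \((N, S)\) and \((N)\).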
- Outputs:
neg_log_likelihood (Tensor) - A loss value that is differentiable with respect to each input node.
log_alpha (Tensor) - The log-domain probability of each possible trace (alignment path) from the input to the target.
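To make the two outputs concrete, the sketch below implements the standard CTC forward (alpha) recursion in the log domain, in pure Python, for a single sample. It is not MindSpore's kernel. `ctc_forward`, `logsumexp` and `max_target_len` are names chosen for illustration, and the per-sample layout of log_alpha, \(T\) rows by \(2S + 1\) columns, is an assumption inferred from the output printed in the Examples section. It also assumes the input length equals \(T\).

```python
import math


def logsumexp(*values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    finite = [v for v in values if v != -math.inf]
    if not finite:
        return -math.inf
    top = max(finite)
    return top + math.log(sum(math.exp(v - top) for v in finite))


def ctc_forward(log_probs, target, max_target_len, blank=0):
    """Reference CTC forward pass for one sample (illustrative only).

    log_probs: T x C nested list; target: unpadded list of labels.
    Returns (neg_log_likelihood, log_alpha), where log_alpha has
    T rows and 2 * max_target_len + 1 columns.
    """
    steps = len(log_probs)
    ext = [blank]
    for label in target:  # extended sequence: blank, l1, blank, l2, ..., blank
        ext += [label, blank]
    width = 2 * max_target_len + 1
    alpha = [[-math.inf] * width for _ in range(steps)]

    # At t = 0 a path may start in the leading blank or in the first label.
    alpha[0][0] = log_probs[0][ext[0]]
    if len(ext) > 1:
        alpha[0][1] = log_probs[0][ext[1]]

    for t in range(1, steps):
        for s in range(len(ext)):
            terms = [alpha[t - 1][s]]
            if s >= 1:
                terms.append(alpha[t - 1][s - 1])
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[t - 1][s - 2])
            alpha[t][s] = logsumexp(*terms) + log_probs[t][ext[s]]

    # A valid path ends in the last label or in the trailing blank.
    last = [alpha[steps - 1][len(ext) - 1]]
    if len(ext) > 1:
        last.append(alpha[steps - 1][len(ext) - 2])
    return -logsumexp(*last), alpha


# Values from the Examples section: T = 2, C = 3, target length 1, S = 2.
log_probs = [[0.3, 0.6, 0.6], [0.9, 0.4, 0.2]]
loss, log_alpha = ctc_forward(log_probs, target=[0], max_target_len=2)
print(round(loss, 6))  # -2.298612
```

Under these assumptions the sketch reproduces the values in the Examples section. The loss there is negative only because the example's log_probs are positive and therefore not true log-probabilities.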
- Raises
TypeError – If zero_infinity is not a bool.
TypeError – If reduction is not a string.
TypeError – If the dtype of log_probs is not float32 or float64.
TypeError – If the dtype of targets, input_lengths or target_lengths is not int32 or int64.
ValueError – If the rank of log_probs is not 3.
ValueError – If the rank of targets is not 2.
ValueError – If the shape of input_lengths does not match batch_size \(N\).
ValueError – If the shape of target_lengths does not match batch_size \(N\).
TypeError – If the types of targets, input_lengths or target_lengths are different.
ValueError – If the value of blank is not in range [0, C).
RuntimeError – If any value of input_lengths is larger than \(T\).
RuntimeError – If any target_lengths[i] is not in range [0, input_lengths[i]].
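Several of the conditions above depend only on shapes and lengths, so they can be checked before the operator is called. The sketch below is a hypothetical pre-flight helper, not part of MindSpore. `check_ctc_args` and its error messages are invented for illustration, and it covers only the rank, batch-size, blank-range and target-length conditions from the list.

```python
def check_ctc_args(log_probs_shape, targets_shape, input_lengths, target_lengths, blank=0):
    """Hypothetical check mirroring some of the documented error conditions."""
    if len(log_probs_shape) != 3:
        raise ValueError("log_probs must have rank 3: (T, N, C)")
    if len(targets_shape) != 2:
        raise ValueError("targets must have rank 2: (N, S)")
    _, batch, classes = log_probs_shape
    if len(input_lengths) != batch:
        raise ValueError("input_lengths must have N entries")
    if len(target_lengths) != batch:
        raise ValueError("target_lengths must have N entries")
    if not 0 <= blank < classes:
        raise ValueError("blank must be in range [0, C)")
    for i, (in_len, tgt_len) in enumerate(zip(input_lengths, target_lengths)):
        if not 0 <= tgt_len <= in_len:
            raise RuntimeError(
                f"target_lengths[{i}] must be in range [0, input_lengths[{i}]]"
            )


# Shapes from the Examples section: T = 2, N = 1, C = 3, S = 2.
check_ctc_args((2, 1, 3), (1, 2), input_lengths=[2], target_lengths=[1])
print("arguments look consistent")
```

Raising the same exception types as the operator keeps calling code unchanged if the check is later removed.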
- Supported Platforms:
Ascend
GPU
CPU
Examples
>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> from mindspore import dtype as mstype
>>> log_probs = Tensor(np.array([[[0.3, 0.6, 0.6]],
...                              [[0.9, 0.4, 0.2]]]).astype(np.float32))
>>> targets = Tensor(np.array([[0, 1]]), mstype.int32)
>>> input_lengths = Tensor(np.array([2]), mstype.int32)
>>> target_lengths = Tensor(np.array([1]), mstype.int32)
>>> CTCLossV2 = ops.CTCLossV2(blank=0, reduction='none', zero_infinity=False)
>>> neg_log_hood, log_alpha = CTCLossV2(
...     log_probs, targets, input_lengths, target_lengths)
>>> print(neg_log_hood)
[-2.2986124]
>>> print(log_alpha)
[[[0.3       0.3       -inf      -inf      -inf]
  [1.2       1.8931472 1.2       -inf      -inf]]]