mindspore.ops.kl_div
- mindspore.ops.kl_div(logits, labels, reduction='mean')
Computes the Kullback-Leibler divergence between the logits and the labels.
For input tensors \(x\) and \(target\) with the same shape, the element-wise loss is defined as
\[L(x, target) = target \cdot (\log target - x)\]
The output is then obtained by applying the selected reduction:
\[\begin{split}\ell(x, target) = \begin{cases} L(x, target), & \text{if reduction} = \text{'none';}\\ \operatorname{mean}(L(x, target)), & \text{if reduction} = \text{'mean';}\\ \operatorname{sum}(L(x, target)) / x.\operatorname{shape}[0], & \text{if reduction} = \text{'batchmean';}\\ \operatorname{sum}(L(x, target)), & \text{if reduction} = \text{'sum'.} \end{cases}\end{split}\]
where \(x\) represents logits, \(target\) represents labels, and \(\ell(x, target)\) represents the output. Because \(x\) enters the formula directly rather than as \(\log x\), logits is treated as a log-space quantity.
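The formulas above can be sketched in plain NumPy. This is an illustrative reference only, not MindSpore's implementation: the helper name `kl_div_reference` is hypothetical, and the treatment of entries where target is 0 (they contribute 0) is an assumption chosen to be consistent with the example output on this page.

```python
import numpy as np

def kl_div_reference(x, target, reduction="mean"):
    """NumPy sketch of the formulas above (hypothetical helper, not MindSpore code)."""
    x = np.asarray(x, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # Assumption: a term with target == 0 contributes 0.
    # Replacing those targets by 1 inside the log avoids log(0);
    # the leading factor target == 0 then zeroes the whole term.
    safe_target = np.where(target > 0, target, 1.0)
    pointwise = target * (np.log(safe_target) - x)

    if reduction == "none":
        return pointwise
    if reduction == "mean":
        return pointwise.mean()
    if reduction == "batchmean":
        return pointwise.sum() / x.shape[0]
    if reduction == "sum":
        return pointwise.sum()
    raise ValueError(f"unsupported reduction: {reduction!r}")

# Same inputs as the example on this page.
result = kl_div_reference([0.2, 0.7, 0.1], [0.0, 1.0, 0.0], "mean")
print(result)  # approximately -0.2333
```

With reduction set to 'none' the helper returns an array of the input shape; the other three modes return a single number.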
Note
Currently it does not support float64 input on Ascend.
The output aligns with the mathematical definition of Kullback-Leibler divergence only when reduction is set to 'batchmean'.
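The note about 'batchmean' can be checked numerically. The following NumPy sketch uses a made-up batch of two distributions over three classes (the values of `p` and `q` are illustrative assumptions, not taken from MindSpore) and compares the 'batchmean' and 'mean' formulas with the textbook definition \(\sum_i p_i \log(p_i / q_i)\) averaged over the batch.

```python
import numpy as np

# Hypothetical batch: 2 samples, 3 classes; every row sums to 1.
p = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3]])  # plays the role of labels (target)
q = np.array([[0.4, 0.4, 0.2],
              [0.2, 0.5, 0.3]])  # model distribution
x = np.log(q)                    # plays the role of logits, in log space

# Element-wise loss from the formula: target * (log(target) - x)
pointwise = p * (np.log(p) - x)

# Textbook KL(p || q) per sample, then averaged over the batch.
textbook_kl = np.sum(p * np.log(p / q), axis=1).mean()

batchmean = pointwise.sum() / x.shape[0]
mean = pointwise.mean()

print(np.isclose(batchmean, textbook_kl))        # True
print(np.isclose(mean * x.shape[1], batchmean))  # True: 'mean' is smaller by the class count
```

For 2-D inputs, 'mean' divides by the total number of elements, so it underestimates the per-sample divergence by a factor equal to the number of classes.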
- Parameters
logits (Tensor) – The input Tensor. The data type must be float16, float32 or float64.
labels (Tensor) – The label Tensor which has the same shape and data type as logits.
reduction (str) – Specifies the reduction to be applied to the output. Its value must be one of 'none', 'mean', 'batchmean' or 'sum'. Default: 'mean'.
  'none': no reduction will be applied.
  'mean': compute and return the mean of elements in the output.
  'sum': the output elements will be summed.
  'batchmean': the summed output elements divided by batch size.
- Returns
Tensor or Scalar. If reduction is 'none', the output is a tensor with the same shape as logits. Otherwise, it is a scalar.
- Raises
- Supported Platforms:
Ascend
GPU
CPU
Examples
>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, ops
>>> logits = Tensor(np.array([0.2, 0.7, 0.1]), mindspore.float32)
>>> labels = Tensor(np.array([0., 1., 0.]), mindspore.float32)
>>> output = mindspore.ops.kl_div(logits, labels, 'mean')
>>> print(output)
-0.23333333
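The printed value can be verified by hand from the formulas on this page, assuming that an element with \(target = 0\) contributes 0 to the loss (an assumption that is consistent with the printed result):
\[L = [\,0,\ 1 \cdot (\log 1 - 0.7),\ 0\,] = [\,0,\ -0.7,\ 0\,], \qquad \operatorname{mean}(L) = \frac{-0.7}{3} \approx -0.2333\]
The result is negative because the example logits are positive numbers and therefore not log-probabilities. When logits holds the logarithm of a probability distribution and labels is a probability distribution, the summed loss is non-negative.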