mindspore.ops.KLDivLoss
- class mindspore.ops.KLDivLoss(reduction='mean')
Computes the Kullback-Leibler divergence between the logits and the labels.
For tensors \(x\) and \(target\) of the same shape, the KLDivLoss output is computed as follows:
\[L(x, target) = target \cdot (\log target - x)\]
Then,
\[\begin{split}\ell(x, target) = \begin{cases}
L(x, target), & \text{if reduction} = \text{'none';}\\
\operatorname{mean}(L(x, target)), & \text{if reduction} = \text{'mean';}\\
\operatorname{sum}(L(x, target)) / x.\operatorname{shape}[0], & \text{if reduction} = \text{'batchmean';}\\
\operatorname{sum}(L(x, target)), & \text{if reduction} = \text{'sum'.}
\end{cases}\end{split}\]
where \(x\) represents logits, \(target\) represents labels, and \(\ell(x, target)\) represents output.
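The formula and its four reduction modes can be sketched in plain NumPy. This is an illustrative reference, not MindSpore's implementation: the function name `kl_div_reference` is hypothetical, and the sketch assumes that elements with `target == 0` contribute 0 to the loss (the limit of \(t \log t\) as \(t \to 0\)).

```python
import numpy as np

def kl_div_reference(x, target, reduction="mean"):
    """NumPy sketch of the KLDivLoss formula (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if x.shape != target.shape:
        raise ValueError("x and target must have the same shape")

    # Assumption: where target == 0 the term is taken to be 0,
    # which avoids evaluating log(0).
    positive = target > 0
    safe_target = np.where(positive, target, 1.0)
    pointwise = np.where(positive, target * (np.log(safe_target) - x), 0.0)

    if reduction == "none":
        return pointwise
    if reduction == "mean":
        return pointwise.mean()
    if reduction == "batchmean":
        return pointwise.sum() / x.shape[0]
    if reduction == "sum":
        return pointwise.sum()
    raise ValueError(f"unsupported reduction: {reduction!r}")

# Same inputs as the Examples section, reduction='sum'.
print(kl_div_reference([0.2, 0.7, 0.1], [0.0, 1.0, 0.0], reduction="sum"))  # -0.7
```

Under that zero-target assumption, only the middle element contributes, giving \(1 \cdot (\log 1 - 0.7) = -0.7\).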
Note
On Ascend, float64 dtype is not currently supported.
The output aligns with the mathematical definition of Kullback-Leibler divergence only when reduction is set to ‘batchmean’.
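The note above can be checked numerically with a NumPy-only sketch. It assumes that `x` holds log-probabilities, as the formula \(target \cdot (\log target - x)\) implies, and that each row of a 2-D input is one sample. The arrays `p` and `q` are made-up illustrative distributions.

```python
import numpy as np

# Hypothetical batch: 2 samples, 3 classes; each row sums to 1.
q = np.array([[0.2, 0.7, 0.1], [0.5, 0.25, 0.25]])   # predicted distribution
p = np.array([[0.1, 0.8, 0.1], [0.25, 0.5, 0.25]])   # target distribution

x = np.log(q)                         # logits as log-probabilities (assumption)
pointwise = p * (np.log(p) - x)       # L(x, target) from the formula

# Textbook KL(p || q) for each sample, summed over classes.
per_sample_kl = pointwise.sum(axis=1)

batchmean = pointwise.sum() / x.shape[0]   # 'batchmean' reduction
mean = pointwise.mean()                    # 'mean' reduction

print(np.isclose(batchmean, per_sample_kl.mean()))  # True
print(np.isclose(mean, batchmean / x.shape[1]))     # True
```

In this setting 'batchmean' equals the average per-sample divergence, while 'mean' additionally divides by the number of classes.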
- Parameters
reduction (str) – Specifies the reduction to be applied to the output. Default: 'mean'.
On Ascend, the value of reduction must be one of 'batchmean', 'none' or 'sum'.
On GPU, the value of reduction must be one of 'mean', 'none' or 'sum'.
On CPU, the value of reduction must be one of 'mean', 'batchmean', 'none' or 'sum'.
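The per-platform restrictions can be captured in a small lookup table so that an unsupported combination is caught before the operator is built. This is a sketch only: `SUPPORTED_REDUCTIONS` and `check_reduction` are hypothetical names, not part of MindSpore, and the table simply transcribes the lists above.

```python
# Transcribed from the parameter description; not queried from MindSpore.
SUPPORTED_REDUCTIONS = {
    "Ascend": {"batchmean", "none", "sum"},
    "GPU": {"mean", "none", "sum"},
    "CPU": {"mean", "batchmean", "none", "sum"},
}

def check_reduction(platform, reduction):
    """Raise ValueError if `reduction` is not listed for `platform`."""
    allowed = SUPPORTED_REDUCTIONS[platform]
    if reduction not in allowed:
        raise ValueError(
            f"reduction={reduction!r} is not listed for {platform}; "
            f"expected one of {sorted(allowed)}"
        )
    return reduction

# Reductions listed for every platform.
portable = set.intersection(*SUPPORTED_REDUCTIONS.values())
print(sorted(portable))  # ['none', 'sum']
```

According to these lists, the default 'mean' is not among the values accepted on Ascend, so code meant to run on every platform may prefer 'none' or 'sum'.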
- Inputs:
logits (Tensor) - The input Tensor. The data type must be float16, float32 or float64.
labels (Tensor) - The label Tensor which has the same shape and data type as logits.
- Outputs:
Tensor or Scalar. If reduction is ‘none’, the output is a tensor with the same shape as logits; otherwise, it is a scalar.
- Raises
TypeError – If reduction is not a str.
TypeError – If neither logits nor labels is a Tensor.
TypeError – If dtype of logits or labels is not currently supported.
ValueError – If shape of logits is not the same as labels.
RuntimeError – If logits or labels is a scalar when reduction is ‘batchmean’.
- Supported Platforms:
Ascend GPU CPU
Examples
>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, nn, ops
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.kldiv_loss = ops.KLDivLoss(reduction='sum')
...     def construct(self, logits, labels):
...         result = self.kldiv_loss(logits, labels)
...         return result
...
>>> net = Net()
>>> logits = Tensor(np.array([0.2, 0.7, 0.1]), mindspore.float32)
>>> labels = Tensor(np.array([0., 1., 0.]), mindspore.float32)
>>> output = net(logits, labels)
>>> print(output)
-0.7