mindspore.mint.nn.functional.binary_cross_entropy_with_logits

mindspore.mint.nn.functional.binary_cross_entropy_with_logits(input, target, weight=None, reduction='mean', pos_weight=None)

Applies the sigmoid activation function to input, treating it as logits, and computes the binary cross entropy between the resulting probabilities and target. Behaves the same as mindspore.ops.binary_cross_entropy_with_logits.
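As a rough illustration of what "sigmoid followed by binary cross entropy" means for a single element, the pure-Python sketch below compares the two-step computation with an algebraically equivalent form that works on the logit directly. The helper names `bce_two_step` and `bce_from_logit` are hypothetical and are not part of MindSpore; how the operator is implemented internally is not stated on this page.

```python
import math

def bce_two_step(x, y):
    """Apply sigmoid to the logit x, then compute binary cross entropy against y."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def bce_from_logit(x, y):
    """Equivalent form on the logit: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

# Both forms agree for moderate logits.
for x, y in [(-0.8, 0.0), (1.2, 1.0), (0.7, 0.3)]:
    print(x, y, bce_two_step(x, y), bce_from_logit(x, y))

# For a large logit, 1 - p rounds to 0 in floating point, so the two-step
# form fails on log(0), while the logit form stays finite.
print(bce_from_logit(50.0, 0.0))
```

The second form is one common way to avoid evaluating the logarithm of a value that has rounded to zero; it is shown here only to motivate passing logits rather than probabilities.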

Let input be \(X\), target be \(Y\), weight be \(W\), and the output be \(L\). Then,

\[\begin{split}\begin{array}{ll} \\ p_{ij} = sigmoid(X_{ij}) = \frac{1}{1 + e^{-X_{ij}}} \\ L_{ij} = -[Y_{ij}log(p_{ij}) + (1 - Y_{ij})log(1 - p_{ij})] \end{array}\end{split}\]

\(i\) indicates the \(i^{th}\) sample, \(j\) indicates the category. Then,

\[\begin{split}\ell(x, y) = \begin{cases} L, & \text{if reduction} = \text{'none';}\\ \operatorname{mean}(L), & \text{if reduction} = \text{'mean';}\\ \operatorname{sum}(L), & \text{if reduction} = \text{'sum'.} \end{cases}\end{split}\]

\(\ell\) indicates the method of calculating the loss. There are three methods: the first method is to provide the loss value directly, the second method is to calculate the average value of all losses, and the third method is to calculate the sum of all losses.
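The elementwise loss and the three reduction modes can be reproduced by hand as a sanity check. The following sketch uses plain Python lists rather than MindSpore tensors, takes all weights to be 1, and reuses the values from the Examples section; `elementwise_loss` and `reduce_loss` are hypothetical helper names.

```python
import math

def elementwise_loss(x, y):
    """L_ij = -[y*log(p) + (1 - y)*log(1 - p)] with p = sigmoid(x)."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def reduce_loss(loss, reduction="mean"):
    """Apply 'none', 'mean' or 'sum' to a 2-D list of losses."""
    if reduction == "none":
        return loss
    flat = [v for row in loss for v in row]
    if reduction == "sum":
        return sum(flat)
    if reduction == "mean":
        return sum(flat) / len(flat)
    raise ValueError("reduction must be 'none', 'mean' or 'sum'")

logits = [[-0.8, 1.2, 0.7], [-0.1, -0.4, 0.7]]
labels = [[0.3, 0.8, 1.2], [-0.6, 0.1, 2.2]]
loss = [[elementwise_loss(x, y) for x, y in zip(x_row, y_row)]
        for x_row, y_row in zip(logits, labels)]

print(reduce_loss(loss, "none"))  # same shape as the input
print(reduce_loss(loss, "sum"))
print(reduce_loss(loss, "mean"))  # approximately 0.34636, in line with the documented example output
```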

This operator will multiply the output by the corresponding weight. The tensor \(weight\) assigns different weights to each piece of data in the batch, and the tensor \(pos\_weight\) adds corresponding weights to the positive examples of each category.

In addition, it can trade off recall and precision by adding weights to positive examples. In the case of multi-label classification the loss can be described as:

\[\begin{split}\begin{array}{ll} \\ p_{ij,c} = sigmoid(X_{ij,c}) = \frac{1}{1 + e^{-X_{ij,c}}} \\ L_{ij,c} = -[P_{c}Y_{ij,c} * log(p_{ij,c}) + (1 - Y_{ij,c})log(1 - p_{ij,c})] \end{array}\end{split}\]

where \(c\) is the class number (\(c>1\) for multi-label binary classification, \(c=1\) for single-label binary classification), \(n\) is the index of the sample in the batch, and \(P_c\) is the weight of the positive answer for class \(c\). \(P_c>1\) increases recall; \(P_c<1\) increases precision.
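To make the role of \(P_c\) concrete, here is a minimal pure-Python sketch of the per-element formula above, with an optional per-element `weight` multiplier as described earlier. The function name `weighted_bce_with_logits` and the sample numbers are illustrative assumptions, not MindSpore code.

```python
import math

def weighted_bce_with_logits(x, y, pos_weight=1.0, weight=1.0):
    """weight * -[pos_weight*y*log(p) + (1 - y)*log(1 - p)], with p = sigmoid(x)."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -weight * (pos_weight * y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

# A positive example (y = 1) that the model scores low (logit -1.0).
x, y = -1.0, 1.0
for p_c in (0.5, 1.0, 3.0):
    print(p_c, weighted_bce_with_logits(x, y, pos_weight=p_c))

# pos_weight has no effect on a negative example (y = 0).
print(weighted_bce_with_logits(x, 0.0, pos_weight=3.0))
print(weighted_bce_with_logits(x, 0.0, pos_weight=1.0))
```

A larger `pos_weight` makes a missed positive proportionally more expensive, which is the mechanism behind the recall and precision trade-off described above.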

Parameters
  • input (Tensor) – Input tensor with shape \((N, *)\), where \(*\) means any number of additional dimensions. The data type must be float16, float32 or bfloat16 (only Atlas A2 series products are supported).

  • target (Tensor) – Ground truth label, with the same shape as input. The data type must be float16, float32 or bfloat16 (only Atlas A2 series products are supported).

  • weight (Tensor, optional) – A rescaling weight applied to the loss of each batch element. It must be broadcastable to the shape of input. The data type must be float16, float32 or bfloat16 (only Atlas A2 series products are supported). Default: None, in which case weight is treated as a tensor whose values are all 1.

  • reduction (str, optional) –

    Apply specific reduction method to the output: 'none' , 'mean' , 'sum' . Default: 'mean' .

    • 'none': no reduction will be applied.

    • 'mean': compute and return the weighted mean of elements in the output.

    • 'sum': the output elements will be summed.

  • pos_weight (Tensor, optional) – A weight for positive examples. Must be a vector with length equal to the number of classes, and must be broadcastable to the shape of input. The data type must be float16, float32 or bfloat16 (only Atlas A2 series products are supported). Default: None, which is equivalent to a pos_weight tensor whose values are all 1.
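As an illustration of the broadcasting requirement on weight and pos_weight, the sketch below applies a length-3 vector to every row of a 2x3 input, which is how a per-class vector lines up with an input of shape \((N, C)\). It uses plain Python lists; `broadcast_rows` is a hypothetical helper that covers only this one case, not the general broadcasting rules.

```python
def broadcast_rows(vector, shape):
    """Repeat a length-C vector along the first axis to match an (N, C) shape."""
    n, c = shape
    if len(vector) != c:
        raise ValueError(f"cannot broadcast length {len(vector)} to shape {shape}")
    return [list(vector) for _ in range(n)]

pos_weight = [1.0, 2.0, 0.5]
print(broadcast_rows(pos_weight, (2, 3)))
# [[1.0, 2.0, 0.5], [1.0, 2.0, 0.5]]
```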

Returns

Tensor or Scalar. If reduction is 'none', the output is a tensor with the same shape and data type as input. Otherwise, the output is a scalar.

Raises
  • TypeError – If input, target, weight or pos_weight is not a Tensor.

  • TypeError – If reduction is not a string.

  • ValueError – If weight or pos_weight cannot be broadcast to a tensor with the shape of input.

  • ValueError – If reduction is not one of 'none', 'mean' or 'sum'.

Supported Platforms:

Ascend

Examples

>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, mint
>>> input = Tensor(np.array([[-0.8, 1.2, 0.7], [-0.1, -0.4, 0.7]]), mindspore.float32)
>>> target = Tensor(np.array([[0.3, 0.8, 1.2], [-0.6, 0.1, 2.2]]), mindspore.float32)
>>> weight = Tensor(np.array([1.0, 1.0, 1.0]), mindspore.float32)
>>> pos_weight = Tensor(np.array([1.0, 1.0, 1.0]), mindspore.float32)
>>> output = mint.nn.functional.binary_cross_entropy_with_logits(input, target, weight, 'mean', pos_weight)
>>> print(output)
0.3463612