mindspore_xai.benchmark

Predefined XAI metrics.

class mindspore_xai.benchmark.ClassSensitivity[source]

Class sensitivity metric used to evaluate attribution-based explanations.

Reasonable attribution-based explainers are expected to generate distinct saliency maps for different labels, especially for the labels of highest and lowest confidence. ClassSensitivity evaluates an explainer by computing the correlation between the saliency maps of its highest-confidence and lowest-confidence labels. An explainer with better class sensitivity receives a lower correlation score. To make the evaluation results intuitive, the returned score is the negative of the correlation, normalized.
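
A minimal NumPy sketch of the scoring idea, assuming the saliency maps for the highest- and lowest-confidence labels have already been generated; the function name and the exact normalization are illustrative, not part of the API:

import numpy as np

def class_sensitivity_score(saliency_max_label, saliency_min_label):
    # Illustrative only: Pearson correlation between the two flattened
    # saliency maps, negated and rescaled so that a lower correlation
    # (better class sensitivity) gives a higher score in [0, 1].
    a = saliency_max_label.flatten()
    b = saliency_min_label.flatten()
    correlation = np.corrcoef(a, b)[0, 1]
    return (1 - correlation) / 2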

Supported Platforms:

Ascend GPU

evaluate(explainer, inputs)[source]

Evaluate class sensitivity on a single data sample.

Parameters
  • explainer (Explanation) – The explainer to be evaluated, see mindspore_xai.explanation.

  • inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).

Returns

numpy.ndarray, 1D array of shape \((N,)\), result of class sensitivity evaluated on explainer.

Raises

TypeError – Raised for any argument type problem.

Examples

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import context
>>> from mindspore_xai.benchmark import ClassSensitivity
>>> from mindspore_xai.explanation import Gradient
>>>
>>> context.set_context(mode=context.PYNATIVE_MODE)
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> # prepare your explainer to be evaluated, e.g., Gradient.
>>> gradient = Gradient(net)
>>> input_x = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> class_sensitivity = ClassSensitivity()
>>> res = class_sensitivity.evaluate(gradient, input_x)
>>> print(res.shape)
(1,)
class mindspore_xai.benchmark.Faithfulness(num_labels, activation_fn, metric='NaiveFaithfulness')[source]

Provides evaluation of the faithfulness of XAI explanations.

Three specific metrics to obtain quantified results are supported: “NaiveFaithfulness”, “DeletionAUC”, and “InsertionAUC”.

For metric “NaiveFaithfulness”, a series of perturbed images is created by modifying pixels of the original image. The perturbed images are then fed to the model, yielding a series of output probability drops. Faithfulness is quantified as the correlation between the probability drops and the saliency map values at the same pixels (the correlation is further normalized to the range [0, 1]).
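
The sketch below illustrates this idea for a single sample, assuming a user-supplied model_prob callable that returns the target-class probability of an image; the helper function and its arguments are hypothetical, not the library API:

import numpy as np

def naive_faithfulness(model_prob, image, saliency, pixels, base_value=0.0):
    # Hypothetical helper: image is a (C, H, W) array, saliency a (H, W)
    # array, pixels a list of (row, col) positions perturbed one at a time.
    original = model_prob(image)
    prob_drops, saliency_values = [], []
    for r, c in pixels:
        perturbed = image.copy()
        perturbed[:, r, c] = base_value              # modify one pixel
        prob_drops.append(original - model_prob(perturbed))
        saliency_values.append(saliency[r, c])
    correlation = np.corrcoef(prob_drops, saliency_values)[0, 1]
    return (correlation + 1) / 2                     # rescale [-1, 1] to [0, 1]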

For metric “DeletionAUC”, a series of perturbed images is created by cumulatively setting pixels of the original image to a base value (e.g. a constant). The perturbation starts from pixels with high saliency values and proceeds to pixels with low saliency values. Feeding the perturbed images into the model in order, an output probability drop curve is obtained. “DeletionAUC” is the area under this probability drop curve.

For metric “InsertionAUC”, a series of perturbed images is created by cumulatively inserting pixels of the original image into a reference image (e.g. a black image). The insertion starts from pixels with high saliency values and proceeds to pixels with low saliency values. Feeding the perturbed images into the model in order, an output probability increase curve is obtained. “InsertionAUC” is the area under this curve.

For all three metrics, a higher value indicates better faithfulness.
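
For the two AUC metrics, the area under the collected probability curve can be computed as in the following sketch, assuming the probabilities of the progressively perturbed images have already been gathered (the helper name is illustrative):

import numpy as np

def perturbation_auc(probabilities):
    # Illustrative only: trapezoidal-rule area under the probability curve,
    # with the x-axis normalized to [0, 1] (fraction of pixels perturbed).
    probabilities = np.asarray(probabilities, dtype=float)
    x = np.linspace(0, 1, len(probabilities))
    return np.trapz(probabilities, x)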

Parameters
  • num_labels (int) – Number of labels.

  • activation_fn (Cell) – The activation layer that transforms logits to prediction probabilities. For single-label classification tasks, nn.Softmax is usually applied. For multi-label classification tasks, nn.Sigmoid is usually applied. Users can also pass their own customized activation_fn as long as, when it is combined with the network, the final output is the probability of the input.

  • metric (str, optional) – The specific metric to quantify faithfulness. Options: “DeletionAUC”, “InsertionAUC”, “NaiveFaithfulness”. Default: “NaiveFaithfulness”.

Raises

TypeError – Raised for any argument type problem.

Supported Platforms:

Ascend GPU

evaluate(explainer, inputs, targets, saliency=None)[source]

Evaluate faithfulness on a single data sample.

Note

Currently only single sample (\(N=1\)) at each call is supported.

Parameters
  • explainer (Explanation) – The explainer to be evaluated, see mindspore_xai.explanation.

  • inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).

  • targets (Tensor, int) – The label of interest. It should be a 1D or 0D tensor, or an integer. If targets is a 1D tensor, its length should be the same as inputs.

  • saliency (Tensor, optional) – The saliency map to be evaluated, a 4D tensor of shape \((N, 1, H, W)\). If it is None, the passed explainer will generate the saliency map with inputs and targets and continue the evaluation. Default: None.

Returns

numpy.ndarray, 1D array of shape \((N,)\), result of faithfulness evaluated on explainer.

Examples

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import context
>>> from mindspore import nn
>>> from mindspore_xai.benchmark import Faithfulness
>>> from mindspore_xai.explanation import Gradient
>>>
>>> context.set_context(mode=context.PYNATIVE_MODE)
>>> # init a `Faithfulness` object
>>> num_labels = 10
>>> metric = "InsertionAUC"
>>> activation_fn = nn.Softmax()
>>> faithfulness = Faithfulness(num_labels, activation_fn, metric)
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> gradient = Gradient(net)
>>> inputs = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> targets = 5
>>> # usage 1: input the explainer and the data to be explained,
>>> # faithfulness is a Faithfulness instance
>>> res = faithfulness.evaluate(gradient, inputs, targets)
>>> print(res.shape)
(1,)
>>> # usage 2: input the generated saliency map
>>> saliency = gradient(inputs, targets)
>>> res = faithfulness.evaluate(gradient, inputs, targets, saliency)
>>> print(res.shape)
(1,)
class mindspore_xai.benchmark.Localization(num_labels, metric='PointingGame')[source]

Provides evaluation of the localization capability of XAI methods.

Two specific metrics to obtain quantified results are supported: “PointingGame” and “IoSR” (Intersection over Salient Region).

For metric “PointingGame”, the localization capability is calculated as the ratio of data samples whose saliency-map maximum lies within the bounding box. Specifically, for a single sample, given the saliency map and its bounding box, the evaluation result is 1 if the maximum point of the saliency map lies within the bounding box, otherwise 0.
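
A minimal NumPy sketch of the “PointingGame” check for one sample (the helper name is illustrative, not the library API):

import numpy as np

def pointing_game(saliency, mask):
    # saliency and mask are (H, W) arrays; mask is 1 inside the
    # ground-truth bounding box. Returns 1.0 if the saliency maximum
    # falls inside the box, otherwise 0.0.
    max_pos = np.unravel_index(np.argmax(saliency), saliency.shape)
    return float(mask[max_pos] > 0)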

For metric “IoSR” (Intersection over Salient Region), the localization capability is calculated as the intersection of the bounding box and the salient region over the area of the salient region. The salient region is defined as the region whose value exceeds \(\theta * \max(saliency)\).
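
A minimal NumPy sketch of “IoSR” for one sample; the helper name and the theta value shown are assumptions for illustration, not the threshold actually used by the library:

import numpy as np

def iosr(saliency, mask, theta=0.5):
    # saliency and mask are (H, W) arrays; the salient region is where
    # saliency exceeds theta * max(saliency). Returns the intersection of
    # that region with the box, divided by the area of the salient region.
    salient_region = saliency > theta * saliency.max()
    intersection = np.logical_and(salient_region, mask > 0).sum()
    return intersection / (salient_region.sum() + 1e-12)  # guard against empty region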

Parameters
  • num_labels (int) – Number of classes in the dataset.

  • metric (str, optional) – Specific metric to calculate localization capability. Options: “PointingGame”, “IoSR”. Default: “PointingGame”.

Raises

TypeError – Raised for any argument type problem.

Supported Platforms:

Ascend GPU

evaluate(explainer, inputs, targets, saliency=None, mask=None)[source]

Evaluate localization on a single data sample.

Note

Currently only single sample (\(N=1\)) at each call is supported.

Parameters
  • explainer (Explanation) – The explainer to be evaluated, see mindspore_xai.explanation.

  • inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).

  • targets (Tensor, int) – The label of interest. It should be a 1D or 0D tensor, or an integer. If targets is a 1D tensor, its length should be the same as inputs.

  • saliency (Tensor, optional) – The saliency map to be evaluated, a 4D tensor of shape \((N, 1, H, W)\). If it is None, the passed explainer will generate the saliency map with inputs and targets and continue the evaluation. Default: None.

  • mask (Tensor, numpy.ndarray) – Ground truth bounding box/masks for the inputs w.r.t. targets, a 4D tensor or numpy.ndarray of shape \((N, 1, H, W)\).

Returns

numpy.ndarray, 1D array of shape \((N,)\), result of localization evaluated on explainer.

Raises

ValueError – Raised for any argument value problem.

Examples

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import context
>>> from mindspore_xai.explanation import Gradient
>>> from mindspore_xai.benchmark import Localization
>>>
>>> context.set_context(mode=context.PYNATIVE_MODE)
>>> num_labels = 10
>>> localization = Localization(num_labels, "PointingGame")
>>>
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> gradient = Gradient(net)
>>> inputs = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> masks = np.zeros([1, 1, 32, 32])
>>> masks[:, :, 10: 20, 10: 20] = 1
>>> targets = 5
>>> # usage 1: input the explainer and the data to be explained,
>>> # localization is a Localization instance
>>> res = localization.evaluate(gradient, inputs, targets, mask=masks)
>>> print(res.shape)
(1,)
>>> # usage 2: input the generated saliency map
>>> saliency = gradient(inputs, targets)
>>> res = localization.evaluate(gradient, inputs, targets, saliency, mask=masks)
>>> print(res.shape)
(1,)
class mindspore_xai.benchmark.Robustness(num_labels, activation_fn)[source]

Robustness perturbs the inputs by adding random noise and chooses the maximum sensitivity over the perturbations as the evaluation score.
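
A minimal sketch of the maximum-sensitivity idea, assuming a user-supplied explain_fn that returns a saliency map for an input; the noise distribution, number of perturbations, and distance measure are assumptions, not the library's exact procedure:

import numpy as np

def max_sensitivity(explain_fn, image, num_perturbations=10, noise_scale=0.1):
    # Hypothetical helper: perturb the input with random noise several times
    # and record how much the saliency map changes; the largest change is
    # the sensitivity score (larger means a less robust explanation).
    base_saliency = explain_fn(image)
    sensitivities = []
    for _ in range(num_perturbations):
        noise = np.random.normal(scale=noise_scale, size=image.shape)
        perturbed_saliency = explain_fn(image + noise)
        sensitivities.append(np.linalg.norm(perturbed_saliency - base_saliency))
    return max(sensitivities)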

Parameters
  • num_labels (int) – Number of classes in the dataset.

  • activation_fn (Cell) – The activation layer that transforms logits to prediction probabilities. For single-label classification tasks, nn.Softmax is usually applied. For multi-label classification tasks, nn.Sigmoid is usually applied. Users can also pass their own customized activation_fn as long as, when it is combined with the network, the final output is the probability of the input.

Raises

TypeError – Raised for any argument type problem.

Supported Platforms:

Ascend GPU

evaluate(explainer, inputs, targets, saliency=None)[source]

Evaluate robustness on a single data sample.

Note

Currently only single sample (\(N=1\)) at each call is supported.

Parameters
  • explainer (Explanation) – The explainer to be evaluated, see mindspore_xai.explanation.

  • inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).

  • targets (Tensor, int) – The label of interest. It should be a 1D or 0D tensor, or an integer. If targets is a 1D tensor, its length should be the same as inputs.

  • saliency (Tensor, optional) – The saliency map to be evaluated, a 4D tensor of shape \((N, 1, H, W)\). If it is None, the passed explainer will generate the saliency map with inputs and targets and continue the evaluation. Default: None.

Returns

numpy.ndarray, 1D array of shape \((N,)\), result of robustness evaluated on explainer.

Raises

ValueError – If batch_size is larger than 1.

Examples

>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import context
>>> from mindspore import nn
>>> from mindspore_xai.explanation import Gradient
>>> from mindspore_xai.benchmark import Robustness
>>>
>>> context.set_context(mode=context.PYNATIVE_MODE)
>>> # Initialize a Robustness benchmarker passing num_labels of the dataset.
>>> num_labels = 10
>>> activation_fn = nn.Softmax()
>>> robustness = Robustness(num_labels, activation_fn)
>>>
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> # prepare your explainer to be evaluated, e.g., Gradient.
>>> gradient = Gradient(net)
>>> input_x = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> target_label = ms.Tensor([0], ms.int32)
>>> # robustness is a Robustness instance
>>> res = robustness.evaluate(gradient, input_x, target_label)
>>> print(res.shape)
(1,)