mindspore_xai.benchmark
Predefined XAI metrics.
- class mindspore_xai.benchmark.ClassSensitivity[source]
Class sensitivity metric used to evaluate attribution-based explainers.
Reasonable attribution-based explainers are expected to generate distinct saliency maps for different labels, especially the highest-confidence and lowest-confidence labels. ClassSensitivity evaluates an explainer by computing the correlation between the saliency maps of the highest-confidence and lowest-confidence labels. An explainer with better class sensitivity receives a lower correlation score. To make the result intuitive, the correlation is negated and normalized before it is returned.
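As a rough illustration, the sketch below computes such a score for two saliency maps with NumPy. The helper class_sensitivity_score and the (1 - corr) / 2 normalization are assumptions made here for clarity, not the exact implementation used by ClassSensitivity.
>>> import numpy as np
>>> def class_sensitivity_score(saliency_max_label, saliency_min_label):
...     """Correlate two saliency maps and map the result into [0, 1]."""
...     corr = np.corrcoef(saliency_max_label.ravel(), saliency_min_label.ravel())[0, 1]
...     return (1.0 - corr) / 2.0  # negate and normalize: lower correlation -> higher score
...
>>> rng = np.random.default_rng(0)
>>> score = class_sensitivity_score(rng.random((32, 32)), rng.random((32, 32)))
>>> 0.0 <= score <= 1.0
True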
- Supported Platforms:
Ascend
GPU
- evaluate(explainer, inputs)[source]
Evaluate class sensitivity on the explainer.
Note
Currently only single sample (\(N=1\)) at each call is supported.
- Parameters
explainer (Explainer) – The explainer to be evaluated, see mindspore_xai.explainer.
inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).
- Returns
numpy.ndarray, 1D array of shape \((N,)\), result of class sensitivity evaluated on explainer.
- Raises
TypeError – Raised for any argument type problem.
ValueError – Raised if \(N\) is not 1.
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import set_context, PYNATIVE_MODE
>>> from mindspore_xai.benchmark import ClassSensitivity
>>> from mindspore_xai.explainer import Gradient
>>>
>>> set_context(mode=PYNATIVE_MODE)
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> # prepare your explainer to be evaluated, e.g., Gradient.
>>> gradient = Gradient(net)
>>> input_x = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> class_sensitivity = ClassSensitivity()
>>> res = class_sensitivity.evaluate(gradient, input_x)
>>> print(res.shape)
(1,)
- class mindspore_xai.benchmark.Faithfulness(num_labels, activation_fn, metric='NaiveFaithfulness')[source]
Provides evaluation of the faithfulness of XAI explanations.
Three specific metrics are supported to obtain quantified results: "NaiveFaithfulness", "DeletionAUC", and "InsertionAUC".
For metric "NaiveFaithfulness", a series of perturbed images is created by modifying pixels of the original image. The perturbed images are then fed to the model, and a series of output probability drops is obtained. Faithfulness is quantified as the correlation between the probability drops and the saliency values on the same pixels (the correlation is further normalized into the range [0, 1]).
For metric "DeletionAUC", a series of perturbed images is created by cumulatively setting pixels of the original image to a base value (e.g. a constant). The perturbation starts from pixels with high saliency values and proceeds to pixels with low saliency values. Feeding the perturbed images to the model in order yields an output probability drop curve, and "DeletionAUC" is the area under this curve.
For metric "InsertionAUC", a series of perturbed images is created by cumulatively inserting pixels of the original image into a reference image (e.g. a black image). The insertion starts from pixels with high saliency values and proceeds to pixels with low saliency values. Feeding the perturbed images to the model in order yields an output probability increase curve, and "InsertionAUC" is the area under this curve.
For all three metrics, a higher value indicates better faithfulness.
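As a rough illustration of the DeletionAUC idea, the sketch below deletes pixels in decreasing saliency order and integrates the resulting probability curve with NumPy. The helper deletion_auc, its predict_prob callable, and the step schedule are assumptions for illustration only, not the internal implementation of Faithfulness.
>>> import numpy as np
>>> def deletion_auc(image, saliency, predict_prob, base_value=0.0, steps=50):
...     """Area under the probability drop curve while deleting the most salient pixels first."""
...     _, h, w = image.shape
...     order = np.argsort(saliency.ravel())[::-1]  # most salient pixels first
...     probs = [predict_prob(image)]
...     perturbed = image.copy()
...     per_step = max(1, order.size // steps)
...     for i in range(0, order.size, per_step):
...         rows, cols = np.unravel_index(order[i:i + per_step], (h, w))
...         perturbed[:, rows, cols] = base_value  # delete a batch of pixels
...         probs.append(predict_prob(perturbed))
...     return np.trapz(probs, np.linspace(0.0, 1.0, len(probs)))
...
>>> image = np.random.rand(3, 8, 8)
>>> saliency = np.random.rand(8, 8)
>>> auc = deletion_auc(image, saliency, lambda img: float(img.mean()))  # dummy "model"
>>> 0.0 <= auc <= 1.0
True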
- Parameters
num_labels (int) – Number of labels.
activation_fn (Cell) – The activation layer that transforms logits to prediction probabilities. For single-label classification tasks, nn.Softmax is usually applied. For multi-label classification tasks, nn.Sigmoid is usually applied. Users can also pass their own customized activation_fn, as long as combining it with the network produces the prediction probability of the input as the final output.
metric (str, optional) – The specific metric to quantify faithfulness. Options: "DeletionAUC", "InsertionAUC", "NaiveFaithfulness". Default: "NaiveFaithfulness".
- Raises
TypeError – Raised for any argument type problem.
- Supported Platforms:
Ascend
GPU
- evaluate(explainer, inputs, targets, saliency=None)[source]
Evaluate faithfulness on the explainer.
Note
Currently only single sample (\(N=1\)) at each call is supported.
- Parameters
explainer (Explainer) – The explainer to be evaluated, see mindspore_xai.explainer.
inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).
targets (Tensor, int) – The label of interest. It should be a 1D or scalar tensor, or an integer. If targets is a 1D tensor, its length should be \(N\).
saliency (Tensor, optional) – The saliency map to be evaluated, a 4D tensor of shape \((N, 1, H, W)\). If it is None, the given explainer will generate the saliency map with inputs and targets and continue the evaluation. Default: None.
- Returns
numpy.ndarray, 1D array of shape \((N,)\), result of faithfulness evaluated on explainer.
- Raises
TypeError – Raised for any argument type problem.
ValueError – Raised if \(N\) is not 1.
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import nn, set_context, PYNATIVE_MODE
>>> from mindspore_xai.benchmark import Faithfulness
>>> from mindspore_xai.explainer import Gradient
>>>
>>> set_context(mode=PYNATIVE_MODE)
>>> # init a `Faithfulness` object
>>> num_labels = 10
>>> metric = "InsertionAUC"
>>> activation_fn = nn.Softmax()
>>> faithfulness = Faithfulness(num_labels, activation_fn, metric)
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> gradient = Gradient(net)
>>> inputs = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> targets = 5
>>> # usage 1: input the explainer and the data to be explained,
>>> # faithfulness is a Faithfulness instance
>>> res = faithfulness.evaluate(gradient, inputs, targets)
>>> print(res.shape)
(1,)
>>> # usage 2: input the generated saliency map
>>> saliency = gradient(inputs, targets)
>>> res = faithfulness.evaluate(gradient, inputs, targets, saliency)
>>> print(res.shape)
(1,)
- class mindspore_xai.benchmark.Localization(num_labels, metric='PointingGame')[source]
Provides evaluation on the localization capability of XAI methods.
Two specific metrics are supported to obtain quantified results: "PointingGame" and "IoSR" (Intersection over Salient Region).
For metric "PointingGame", the localization capability is calculated as the ratio of data in which the maximum position of the saliency map lies within the bounding box. Specifically, for a single datum, given the saliency map and its bounding box, the result is 1 if the maximum point of the saliency map lies within the bounding box, otherwise 0.
For metric "IoSR" (Intersection over Salient Region), the localization capability is calculated as the intersection of the bounding box and the salient region over the area of the salient region. The salient region is defined as the region whose values exceed \(\theta * \max{saliency}\). A conceptual sketch of both metrics is given after the parameter list below.
- Parameters
num_labels (int) – Number of labels.
metric (str, optional) – The specific metric to quantify localization capability. Options: "PointingGame", "IoSR". Default: "PointingGame".
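As a rough illustration, the sketch below computes both metrics for a single 2D saliency map with NumPy. The helpers pointing_game and iosr, and the choice of theta = 0.5, are assumptions for illustration only, not the internal implementation of Localization.
>>> import numpy as np
>>> def pointing_game(saliency, mask):
...     """1.0 if the saliency maximum falls inside the ground-truth region, else 0.0."""
...     max_pos = np.unravel_index(np.argmax(saliency), saliency.shape)
...     return float(mask[max_pos] > 0)
...
>>> def iosr(saliency, mask, theta=0.5):
...     """Intersection of the salient region with the ground truth over the salient region."""
...     salient = saliency >= theta * saliency.max()
...     return float((salient & (mask > 0)).sum()) / max(int(salient.sum()), 1)
...
>>> saliency = np.zeros((32, 32)); saliency[12:18, 12:18] = 1.0
>>> mask = np.zeros((32, 32)); mask[10:20, 10:20] = 1
>>> pointing_game(saliency, mask), iosr(saliency, mask)
(1.0, 1.0)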
- Raises
TypeError – Raised for any argument type problem.
- Supported Platforms:
Ascend
GPU
- evaluate(explainer, inputs, targets, saliency=None, mask=None)[source]
Evaluate localization on the explainer.
Note
Currently only single sample (\(N=1\)) at each call is supported.
- Parameters
explainer (Explainer) – The explainer to be evaluated, see mindspore_xai.explainer.
inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).
targets (Tensor, int) – The label of interest. It should be a 1D or scalar tensor, or an integer. If targets is a 1D tensor, its length should be \(N\).
saliency (Tensor, optional) – The saliency map to be evaluated, a 4D tensor of shape \((N, 1, H, W)\). If it is None, the given explainer will generate the saliency map with inputs and targets and continue the evaluation. Default: None.
mask (Tensor, numpy.ndarray, optional) – Ground truth bounding boxes/masks for the inputs w.r.t. targets, a 4D tensor or numpy.ndarray of shape \((N, 1, H, W)\). Default: None.
- Returns
numpy.ndarray, 1D array of shape \((N,)\), result of localization evaluated on explainer.
- Raises
TypeError – Raised for any argument type problem.
ValueError – Raised if \(N\) is not 1.
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import set_context, PYNATIVE_MODE
>>> from mindspore_xai.explainer import Gradient
>>> from mindspore_xai.benchmark import Localization
>>>
>>> set_context(mode=PYNATIVE_MODE)
>>> num_labels = 10
>>> localization = Localization(num_labels, "PointingGame")
>>>
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> gradient = Gradient(net)
>>> inputs = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> masks = np.zeros([1, 1, 32, 32])
>>> masks[:, :, 10: 20, 10: 20] = 1
>>> targets = 5
>>> # usage 1: input the explainer and the data to be explained,
>>> # localization is a Localization instance
>>> res = localization.evaluate(gradient, inputs, targets, mask=masks)
>>> print(res.shape)
(1,)
>>> # usage 2: input the generated saliency map
>>> saliency = gradient(inputs, targets)
>>> res = localization.evaluate(gradient, inputs, targets, saliency, mask=masks)
>>> print(res.shape)
(1,)
- class mindspore_xai.benchmark.Robustness(num_labels, activation_fn)[source]
Robustness perturbs the inputs by adding random noise and takes the maximum sensitivity among the perturbations as the evaluation score.
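As a rough illustration of the maximum-sensitivity idea, the sketch below adds Gaussian noise to the input several times, recomputes the saliency map, and keeps the largest relative change. The helper max_sensitivity, the noise scale, and the norm used here are assumptions for illustration only, not the internal implementation of Robustness.
>>> import numpy as np
>>> def max_sensitivity(explain_fn, x, noise_std=0.1, trials=10, seed=0):
...     """Worst-case relative change of the saliency map over random perturbations."""
...     rng = np.random.default_rng(seed)
...     base = explain_fn(x)
...     base_norm = max(np.linalg.norm(base), 1e-12)
...     sensitivities = []
...     for _ in range(trials):
...         noisy = x + rng.normal(0.0, noise_std, size=x.shape)
...         sensitivities.append(np.linalg.norm(explain_fn(noisy) - base) / base_norm)
...     return max(sensitivities)
...
>>> x = np.random.rand(1, 3, 32, 32)
>>> sens = max_sensitivity(lambda arr: arr ** 2, x)  # dummy "explainer"
>>> sens >= 0.0
True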
- Parameters
num_labels (int) – Number of classes in the dataset.
activation_fn (Cell) – The activation layer that transforms logits to prediction probabilities. For single-label classification tasks, nn.Softmax is usually applied. For multi-label classification tasks, nn.Sigmoid is usually applied. Users can also pass their own customized activation_fn, as long as combining it with the network produces the prediction probability of the input as the final output.
- Raises
TypeError – Raised for any argument type problem.
- Supported Platforms:
Ascend
GPU
- evaluate(explainer, inputs, targets, saliency=None)[source]
Evaluate robustness on the explainer.
Note
Currently only single sample (\(N=1\)) at each call is supported.
- Parameters
explainer (Explainer) – The explainer to be evaluated, see mindspore_xai.explainer.
inputs (Tensor) – A data sample, a 4D tensor of shape \((N, C, H, W)\).
targets (Tensor, int) – The label of interest. It should be a 1D or scalar tensor, or an integer. If targets is a 1D tensor, its length should be \(N\).
saliency (Tensor, optional) – The saliency map to be evaluated, a 4D tensor of shape \((N, 1, H, W)\). If it is None, the given explainer will generate the saliency map with inputs and targets and continue the evaluation. Default: None.
- Returns
numpy.ndarray, 1D array of shape \((N,)\), result of robustness evaluated on explainer.
- Raises
TypeError – Raised for any argument type problem.
ValueError – Raised if \(N\) is not 1.
Examples
>>> import numpy as np
>>> import mindspore as ms
>>> from mindspore import nn, set_context, PYNATIVE_MODE
>>> from mindspore_xai.explainer import Gradient
>>> from mindspore_xai.benchmark import Robustness
>>>
>>> set_context(mode=PYNATIVE_MODE)
>>> # Initialize a Robustness benchmarker passing num_labels of the dataset.
>>> num_labels = 10
>>> activation_fn = nn.Softmax()
>>> robustness = Robustness(num_labels, activation_fn)
>>>
>>> # The detail of LeNet5 is shown in model_zoo.official.cv.lenet.src.lenet.py
>>> net = LeNet5(10, num_channel=3)
>>> # prepare your explainer to be evaluated, e.g., Gradient.
>>> gradient = Gradient(net)
>>> input_x = ms.Tensor(np.random.rand(1, 3, 32, 32), ms.float32)
>>> target_label = ms.Tensor([0], ms.int32)
>>> # robustness is a Robustness instance
>>> res = robustness.evaluate(gradient, input_x, target_label)
>>> print(res.shape)
(1,)