mindarmour.adv_robustness.attacks

This module includes classical black-box and white-box attack algorithms for generating adversarial examples.

class mindarmour.adv_robustness.attacks.BasicIterativeMethod(network, eps=0.3, eps_iter=0.1, bounds=(0.0, 1.0), is_targeted=False, nb_iter=5, loss_fn=None)[source]

The Basic Iterative Method attack, an iterative FGSM method to generate adversarial examples.

References: A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” in ICLR, 2017

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of adversarial perturbation generated by the attack to data range. Default: 0.3.

  • eps_iter (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.1.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • nb_iter (int) – Number of iterations. Default: 5.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore.ops import operations as P
>>> from mindarmour.adv_robustness.attacks import BasicIterativeMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> attack = BasicIterativeMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> adv_x = attack.generate(inputs, labels)
generate(inputs, labels)[source]

Simple iterative FGSM method to generate adversarial examples.

Parameters
  • inputs (Union[numpy.ndarray, tuple]) – Benign input samples used as references to create adversarial examples.

  • labels (Union[numpy.ndarray, tuple]) – Original/target labels. If an input has more than one label, the labels are wrapped in a tuple.

Returns

numpy.ndarray, generated adversarial examples.
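The interplay of eps, eps_iter, nb_iter and bounds can be illustrated without MindSpore. The following is a minimal NumPy sketch, not the library's implementation. It assumes a hypothetical linear softmax classifier (W, b), an L-inf projection onto the eps ball, and that eps and eps_iter are scaled by the data range clip_max - clip_min. The names bim_sketch and loss_grad are illustrative only.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_grad(x, y_onehot, W, b):
    """Gradient of softmax cross-entropy w.r.t. x for logits = x @ W + b."""
    p = softmax(x @ W + b)
    return (p - y_onehot) @ W.T

def bim_sketch(x, y_onehot, W, b, eps=0.3, eps_iter=0.1,
               bounds=(0.0, 1.0), nb_iter=5):
    clip_min, clip_max = bounds
    data_range = clip_max - clip_min
    x_adv = x.copy()
    for _ in range(nb_iter):
        grad = loss_grad(x_adv, y_onehot, W, b)
        # Untargeted step: move along the gradient sign to increase the loss.
        x_adv = x_adv + eps_iter * data_range * np.sign(grad)
        # Keep the total perturbation within eps * data_range of the original.
        x_adv = np.clip(x_adv, x - eps * data_range, x + eps * data_range)
        # Keep the result inside the valid data range.
        x_adv = np.clip(x_adv, clip_min, clip_max)
    return x_adv

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))   # hypothetical model weights
b = np.zeros(3)
x = np.array([[0.1, 0.2, 0.7]])
y = np.eye(3)[[2]]
x_adv = bim_sketch(x, y, W, b)
max_perturbation = np.abs(x_adv - x).max()
```

With the defaults, each of the 5 iterations moves every component by at most 0.1, and the accumulated perturbation is capped at 0.3 of the data range.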

class mindarmour.adv_robustness.attacks.CarliniWagnerL2Attack(network, num_classes, box_min=0.0, box_max=1.0, bin_search_steps=5, max_iterations=1000, confidence=0, learning_rate=0.005, initial_const=0.01, abort_early_check_ratio=0.05, targeted=False, fast=True, abort_early=True, sparse=True)[source]

The Carlini & Wagner attack using L2 norm generates the adversarial examples by utilizing two separate losses: an adversarial loss to make the generated example actually adversarial, and a distance loss to constrain the quality of the adversarial example.

References: Nicholas Carlini, David Wagner: “Towards Evaluating the Robustness of Neural Networks”

Parameters
  • network (Cell) – Target model.

  • num_classes (int) – Number of labels of model output, which should be greater than zero.

  • box_min (float) – Lower bound of input of the target model. Default: 0.

  • box_max (float) – Upper bound of input of the target model. Default: 1.0.

  • bin_search_steps (int) – The number of steps for the binary search used to find the optimal trade-off constant between distance and confidence. Default: 5.

  • max_iterations (int) – The maximum number of iterations, which should be greater than zero. Default: 1000.

  • confidence (float) – Confidence of the output of adversarial examples. Default: 0.

  • learning_rate (float) – The learning rate for the attack algorithm. Default: 5e-3.

  • initial_const (float) – The initial trade-off constant to use to balance the relative importance of perturbation norm and confidence difference. Default: 1e-2.

  • abort_early_check_ratio (float) – Check loss progress every ratio of all iteration. Default: 5e-2.

  • targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • fast (bool) – If True, return the first found adversarial example. If False, return the adversarial samples with smaller perturbations. Default: True.

  • abort_early (bool) – If True, Adam will be aborted if the loss hasn’t decreased for some time. If False, Adam will continue to work until the maximum number of iterations is reached. Default: True.

  • sparse (bool) – If True, input labels are sparse-coded. If False, input labels are onehot-coded. Default: True.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> import mindspore.ops.operations as P
>>> from mindarmour.adv_robustness.attacks import CarliniWagnerL2Attack
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> input_np = np.array([[0.1, 0.2, 0.7, 0.5, 0.4]]).astype(np.float32)
>>> num_classes = input_np.shape[1]
>>> label_np = np.array([3]).astype(np.int64)
>>> attack = CarliniWagnerL2Attack(net, num_classes, targeted=False)
>>> adv_data = attack.generate(input_np, label_np)
generate(inputs, labels)[source]

Generate adversarial examples based on input data and targeted labels.

Parameters
Returns

numpy.ndarray, generated adversarial examples.
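The two losses mentioned in the class description can be written down compactly. The sketch below follows the formulation in the Carlini and Wagner paper (squared L2 distance plus a margin-based adversarial term weighted by a trade-off constant, and a tanh change of variables to respect the box constraint). Whether the library computes these terms in exactly this way is an assumption, and the function names cw_l2_objective and to_box are hypothetical.

```python
import numpy as np

def to_box(w, box_min=0.0, box_max=1.0):
    """Map an unconstrained variable w into [box_min, box_max] via tanh."""
    return box_min + (box_max - box_min) * (np.tanh(w) + 1.0) / 2.0

def cw_l2_objective(x, x_adv, logits_adv, label, const=0.01,
                    confidence=0.0, targeted=False):
    # Distance loss: squared L2 norm of the perturbation.
    dist = np.sum((x_adv - x) ** 2)
    this = logits_adv[label]
    other = np.max(np.delete(logits_adv, label))  # best logit among other classes
    if targeted:
        # Want the target logit to exceed all others by at least `confidence`.
        adv = max(other - this + confidence, 0.0)
    else:
        # Want some other logit to exceed the true one by at least `confidence`.
        adv = max(this - other + confidence, 0.0)
    return dist + const * adv, dist, adv

x = np.array([0.1, 0.2, 0.7, 0.5, 0.4])
x_adv = x + np.array([0.0, 0.0, 0.1, 0.0, 0.0])
logits = np.array([1.0, 2.0, 0.5, 3.0, 0.0])   # hypothetical model outputs
total, dist, adv = cw_l2_objective(x, x_adv, logits, label=3, const=0.01)
boxed = to_box(np.array([-10.0, 0.0, 10.0]))
```

The binary search controlled by bin_search_steps adjusts const: a larger constant favors the adversarial term, a smaller one favors a small perturbation.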

class mindarmour.adv_robustness.attacks.DeepFool(network, num_classes, model_type='classification', reserve_ratio=0.3, max_iters=50, overshoot=0.02, norm_level=2, bounds=None, sparse=True)[source]

DeepFool is an untargeted & iterative attack achieved by moving the benign sample to the nearest classification boundary and crossing the boundary.

Reference: DeepFool: a simple and accurate method to fool deep neural networks

Parameters
  • network (Cell) – Target model.

  • num_classes (int) – Number of labels of model output, which should be greater than zero.

  • model_type (str) – The type of targeted model. ‘classification’ and ‘detection’ are supported now. Default: ‘classification’.

  • reserve_ratio (Union[int, float]) – The percentage of objects that can be detected after attacks, specifically for model_type=’detection’. Reserve_ratio should be in the range of (0, 1). Default: 0.3.

  • max_iters (int) – Max iterations, which should be greater than zero. Default: 50.

  • overshoot (float) – Overshoot parameter. Default: 0.02.

  • norm_level (Union[int, str]) – Order of the vector norm. Possible values: np.inf or 2. Default: 2.

  • bounds (Union[tuple, list]) – Upper and lower bounds of data range. In form of (clip_min, clip_max). Default: None.

  • sparse (bool) – If True, input labels are sparse-coded. If False, input labels are onehot-coded. Default: True.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> import mindspore.ops.operations as P
>>> from mindspore import Tensor
>>> from mindarmour.adv_robustness.attacks import DeepFool
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> input_shape = (1, 5)
>>> _, classes = input_shape
>>> attack = DeepFool(net, classes, max_iters=10, norm_level=2,
...                   bounds=(0.0, 1.0))
>>> input_np = np.array([[0.1, 0.2, 0.7, 0.5, 0.4]]).astype(np.float32)
>>> input_me = Tensor(input_np)
>>> true_labels = np.argmax(net(input_me).asnumpy(), axis=1)
>>> advs = attack.generate(input_np, true_labels)
generate(inputs, labels)[source]

Generate adversarial examples based on input samples and original labels.

Parameters
  • inputs (Union[numpy.ndarray, tuple]) – Input samples. The format of inputs should be numpy.ndarray if model_type=’classification’. The format of inputs can be (input1, input2, …) or only one array if model_type=’detection’.

  • labels (Union[numpy.ndarray, tuple]) – Targeted labels or ground-truth labels. The format of labels should be numpy.ndarray if model_type=’classification’. The format of labels should be (gt_boxes, gt_labels) if model_type=’detection’.

Returns

numpy.ndarray, adversarial examples.

Raises

NotImplementedError – If norm_level is not in [2, np.inf, ‘2’, ‘inf’].
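For intuition about "moving the benign sample to the nearest classification boundary and crossing the boundary", the step has a closed form when the classifier is linear. The following is a minimal sketch for the L2 case under that assumption. The model (W, b) and the function name deepfool_linear_step are hypothetical, and the library's iterative procedure for nonlinear networks is not reproduced here.

```python
import numpy as np

def deepfool_linear_step(x, W, b, overshoot=0.02):
    """One DeepFool step (L2) for a linear classifier with logits = x @ W + b."""
    logits = x @ W + b
    k0 = int(np.argmax(logits))          # currently predicted class
    best_ratio, best_r = np.inf, None
    for k in range(W.shape[1]):
        if k == k0:
            continue
        w_k = W[:, k] - W[:, k0]         # normal of the boundary between k and k0
        f_k = logits[k] - logits[k0]     # signed margin to that boundary
        norm = np.linalg.norm(w_k)       # assumed non-zero in this sketch
        ratio = abs(f_k) / norm          # L2 distance to the boundary
        if ratio < best_ratio:
            best_ratio = ratio
            best_r = (abs(f_k) / norm ** 2) * w_k
    # Overshoot slightly so the sample actually crosses the boundary.
    return x + (1.0 + overshoot) * best_r, k0

W = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, 0.5]])          # hypothetical weights: 2 features, 3 classes
b = np.array([0.0, 0.1, 0.0])
x = np.array([0.6, 0.2])
x_adv, original_class = deepfool_linear_step(x, W, b)
new_class = int(np.argmax(x_adv @ W + b))
```

For a nonlinear network, the same step is applied repeatedly to a local linearization of the model, up to max_iters times.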

class mindarmour.adv_robustness.attacks.DiverseInputIterativeMethod(network, eps=0.3, bounds=(0.0, 1.0), is_targeted=False, prob=0.5, loss_fn=None)[source]

The Diverse Input Iterative Method attack follows the basic iterative method, and applies random transformation to the input data at each iteration. Such transformation on the input data could improve the transferability of the adversarial examples.

References: Xie, Cihang and Zhang, et al., “Improving Transferability of Adversarial Examples With Input Diversity,” in CVPR, 2019

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of adversarial perturbation generated by the attack to data range. Default: 0.3.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • prob (float) – Transformation probability. Default: 0.5.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore.ops import operations as P
>>> from mindarmour.adv_robustness.attacks import DiverseInputIterativeMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> attack = DiverseInputIterativeMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> adv_x = attack.generate(inputs, labels)
class mindarmour.adv_robustness.attacks.FastGradientMethod(network, eps=0.07, alpha=None, bounds=(0.0, 1.0), norm_level=2, is_targeted=False, loss_fn=None)[source]

This attack is a one-step attack based on gradients calculation, and the norm of perturbations includes L1, L2 and Linf.

References: I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” in ICLR, 2015.

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.07.

  • alpha (float) – Proportion of single-step random perturbation to data range. Default: None.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • norm_level (Union[int, numpy.inf]) – Order of the norm. Possible values: np.inf, 1 or 2. Default: 2.

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import FastGradientMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> net = Net()
>>> attack = FastGradientMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> adv_x = attack.generate(inputs, labels)
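
The effect of norm_level on a one-step gradient attack can be sketched in plain NumPy. This is an illustration under assumptions, not the library's code: the gradient is taken as given, it is normalized per sample by the chosen norm (the sign is used for the Linf case), and eps is scaled by the data range. The helper names normalize_gradient and fgm_step are hypothetical.

```python
import numpy as np

def normalize_gradient(grad, norm_level=2, tiny=1e-12):
    """Scale each sample's gradient to unit L1/L2 norm, or take its sign for Linf."""
    if norm_level in (np.inf, 'inf'):
        return np.sign(grad)
    flat = grad.reshape(grad.shape[0], -1)
    if norm_level in (1, 'l1'):
        scale = np.abs(flat).sum(axis=1)
    elif norm_level in (2, 'l2'):
        scale = np.sqrt((flat ** 2).sum(axis=1))
    else:
        raise NotImplementedError('norm_level must be 1, 2 or np.inf')
    # Reshape so the per-sample scale broadcasts over the remaining axes.
    shape = (-1,) + (1,) * (grad.ndim - 1)
    return grad / (scale.reshape(shape) + tiny)

def fgm_step(x, grad, eps=0.07, bounds=(0.0, 1.0), norm_level=2):
    clip_min, clip_max = bounds
    perturbation = eps * (clip_max - clip_min) * normalize_gradient(grad, norm_level)
    return np.clip(x + perturbation, clip_min, clip_max)

x = np.array([[0.1, 0.2, 0.7]])
grad = np.array([[3.0, -4.0, 0.0]])      # hypothetical loss gradient w.r.t. x
adv_l2 = fgm_step(x, grad, norm_level=2)
adv_inf = fgm_step(x, grad, norm_level=np.inf)
```

With norm_level=2 the perturbation has L2 norm eps, whereas with np.inf every component with a non-zero gradient moves by eps.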
class mindarmour.adv_robustness.attacks.FastGradientSignMethod(network, eps=0.07, alpha=None, bounds=(0.0, 1.0), is_targeted=False, loss_fn=None)[source]

The Fast Gradient Sign Method attack calculates the gradient of the input data, and then uses the sign of the gradient to create adversarial noises.

References: Ian J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” in ICLR, 2015

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.07.

  • alpha (float) – Proportion of single-step random perturbation to data range. Default: None.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import FastGradientSignMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> net = Net()
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> attack = FastGradientSignMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> adv_x = attack.generate(inputs, labels)
class mindarmour.adv_robustness.attacks.GeneticAttack(model, model_type='classification', targeted=True, reserve_ratio=0.3, sparse=True, pop_size=6, mutation_rate=0.005, per_bounds=0.15, max_steps=1000, step_size=0.2, temp=0.3, bounds=(0, 1.0), adaptive=False, c=0.1)[source]

The Genetic Attack represents the black-box attack based on the genetic algorithm, which belongs to evolutionary algorithms.

This attack was proposed by Moustafa Alzantot et al. (2018).

References: Moustafa Alzantot, Yash Sharma, Supriyo Chakraborty, “GeneticAttack: Practical Black-box Attacks with Gradient-Free Optimization”

Parameters
  • model (BlackModel) – Target model.

  • model_type (str) – The type of targeted model. ‘classification’ and ‘detection’ are supported now. Default: ‘classification’.

  • targeted (bool) – If True, turns on the targeted attack. If False, turns on the untargeted attack. Note that only the untargeted attack is supported for model_type=’detection’. Default: True.

  • reserve_ratio (Union[int, float]) – The percentage of objects that can be detected after attacks, specifically for model_type=’detection’. Reserve_ratio should be in the range of (0, 1). Default: 0.3.

  • pop_size (int) – The number of particles, which should be greater than zero. Default: 6.

  • mutation_rate (Union[int, float]) – The probability of mutations, which should be in the range of (0, 1). Default: 0.005.

  • per_bounds (Union[int, float]) – Maximum L_inf distance. Default: 0.15.

  • max_steps (int) – The maximum round of iteration for each adversarial example. Default: 1000.

  • step_size (Union[int, float]) – Attack step size. Default: 0.2.

  • temp (Union[int, float]) – Sampling temperature for selection. Default: 0.3. The greater the temp, the greater the differences between individuals’ selecting probabilities.

  • bounds (Union[tuple, list, None]) – Upper and lower bounds of data. In form of (clip_min, clip_max). Default: (0, 1.0).

  • adaptive (bool) – If True, turns on dynamic scaling of mutation parameters. If False, turns on static mutation parameters. Default: False.

  • sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: True.

  • c (Union[int, float]) – Weight of perturbation loss. Default: 0.1.

Examples

>>> import numpy as np
>>> import mindspore.ops.operations as M
>>> from mindspore import Tensor
>>> from mindspore.nn import Cell
>>> from mindarmour import BlackModel
>>> from mindarmour.adv_robustness.attacks import GeneticAttack
>>> class ModelToBeAttacked(BlackModel):
...     def __init__(self, network):
...         super(ModelToBeAttacked, self).__init__()
...         self._network = network
...     def predict(self, inputs):
...         result = self._network(Tensor(inputs.astype(np.float32)))
...         return result.asnumpy()
>>> class Net(Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = M.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> model = ModelToBeAttacked(net)
>>> attack = GeneticAttack(model, sparse=False)
>>> batch_size = 6
>>> x_test = np.random.rand(batch_size, 10)
>>> y_test = np.random.randint(low=0, high=10, size=batch_size)
>>> y_test = np.eye(10)[y_test]
>>> y_test = y_test.astype(np.float32)
>>> _, adv_data, _ = attack.generate(x_test, y_test)
generate(inputs, labels)[source]

Generate adversarial examples based on input data and targeted labels (or ground_truth labels).

Parameters
  • inputs (Union[numpy.ndarray, tuple]) – Input samples. The format of inputs should be numpy.ndarray if model_type=’classification’. The format of inputs can be (input1, input2, …) or only one array if model_type=’detection’.

  • labels (Union[numpy.ndarray, tuple]) – Targeted labels or ground-truth labels. The format of labels should be numpy.ndarray if model_type=’classification’. The format of labels should be (gt_boxes, gt_labels) if model_type=’detection’.

Returns

  • numpy.ndarray, bool values for each attack result.

  • numpy.ndarray, generated adversarial examples.

  • numpy.ndarray, query times for each sample.
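
The three returned arrays are typically combined into summary statistics. The sketch below shows one way to do that. The helper summarize_attack and the sample values are hypothetical and stand in for what attack.generate would return.

```python
import numpy as np

def summarize_attack(is_adv, adv_data, query_times):
    """Compute success rate and mean queries over successful samples."""
    is_adv = np.asarray(is_adv, dtype=bool)
    query_times = np.asarray(query_times)
    success_rate = float(is_adv.mean())
    # Average the query count only over samples where the attack succeeded.
    avg_queries = float(query_times[is_adv].mean()) if is_adv.any() else float('nan')
    return success_rate, avg_queries, adv_data[is_adv]

# Hypothetical results for three samples with ten features each.
is_adv = np.array([True, False, True])
adv_data = np.zeros((3, 10), dtype=np.float32)
query_times = np.array([120, 1000, 80])
success_rate, avg_queries, successful_advs = summarize_attack(is_adv, adv_data, query_times)
```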

class mindarmour.adv_robustness.attacks.HopSkipJumpAttack(model, init_num_evals=100, max_num_evals=1000, stepsize_search='geometric_progression', num_iterations=20, gamma=1.0, constraint='l2', batch_size=32, clip_min=0.0, clip_max=1.0, sparse=True)[source]

HopSkipJumpAttack, proposed by Chen, Jordan and Wainwright, is a decision-based attack. The attack requires access only to the output labels of the target model.

References: Chen J, Michael I. Jordan, Martin J. Wainwright. HopSkipJumpAttack: A Query-Efficient Decision-Based Attack. 2019. arXiv:1904.02144

Parameters
  • model (BlackModel) – Target model.

  • init_num_evals (int) – The initial number of evaluations for gradient estimation. Default: 100.

  • max_num_evals (int) – The maximum number of evaluations for gradient estimation. Default: 1000.

  • stepsize_search (str) – Indicates how to search for the step size. Possible values are ‘geometric_progression’ and ‘grid_search’. Default: ‘geometric_progression’.

  • num_iterations (int) – The number of iterations. Default: 20.

  • gamma (float) – Used to set the binary search threshold theta. For the l2 attack, the binary search threshold theta is \(gamma / d^{3/2}\). For the linf attack, it is \(gamma / d^2\). Default: 1.0.

  • constraint (str) – The norm distance to optimize. Possible values are ‘l2’, ‘linf’. Default: ‘l2’.

  • batch_size (int) – Batch size. Default: 32.

  • clip_min (float, optional) – The minimum image component value. Default: 0.

  • clip_max (float, optional) – The maximum image component value. Default: 1.

  • sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: True.

Raises
  • ValueError – If stepsize_search not in [‘geometric_progression’, ‘grid_search’]

  • ValueError – If constraint not in [‘l2’, ‘linf’]
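
The binary search threshold described for gamma is simple to compute. The sketch below assumes that d is the total number of input components (for example, c * h * w for one image), as in the HopSkipJumpAttack paper. The function name binary_search_threshold is hypothetical.

```python
import numpy as np

def binary_search_threshold(input_shape, gamma=1.0, constraint='l2'):
    """Return theta = gamma / d^(3/2) for 'l2' or gamma / d^2 for 'linf'."""
    d = int(np.prod(input_shape))        # assumed: number of input components
    if constraint == 'l2':
        return gamma / (d ** 1.5)
    if constraint == 'linf':
        return gamma / (d ** 2)
    raise ValueError("constraint must be 'l2' or 'linf'")

# A single 1 x 32 x 32 image has d = 1024 components.
theta_l2 = binary_search_threshold((1, 32, 32), constraint='l2')
theta_linf = binary_search_threshold((1, 32, 32), constraint='linf')
```

A larger gamma loosens the threshold, so the binary search toward the decision boundary stops earlier and uses fewer queries.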

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> from mindarmour import BlackModel
>>> import mindspore.ops.operations as P
>>> from mindarmour.adv_robustness.attacks import HopSkipJumpAttack
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...         self._reduce = P.ReduceSum()
...         self._squeeze = P.Squeeze(1)
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         out = self._reduce(out, 2)
...         out = self._squeeze(out)
...         return out
>>> class ModelToBeAttacked(BlackModel):
...     def __init__(self, network):
...         super(ModelToBeAttacked, self).__init__()
...         self._network = network
...     def predict(self, inputs):
...         if len(inputs.shape) == 3:
...             inputs = inputs[np.newaxis, :]
...         result = self._network(Tensor(inputs.astype(np.float32)))
...         return result.asnumpy()
>>> net = Net()
>>> model = ModelToBeAttacked(net)
>>> attack = HopSkipJumpAttack(model)
>>> n, c, h, w = 1, 1, 32, 32
>>> class_num = 3
>>> x_test = np.asarray(np.random.random((n,c,h,w)), np.float32)
>>> y_test = np.random.randint(0, class_num, size=n)
>>> _, adv_x, _= attack.generate(x_test, y_test)
generate(inputs, labels)[source]

Generate adversarial images in a for loop.

Parameters
Returns

  • numpy.ndarray, bool values for each attack result.

  • numpy.ndarray, generated adversarial examples.

  • numpy.ndarray, query times for each sample.

set_target_images(target_images)[source]

Set target images for the targeted attack.

Parameters

target_images (numpy.ndarray) – Target images.

class mindarmour.adv_robustness.attacks.IterativeGradientMethod(network, eps=0.3, eps_iter=0.1, bounds=(0.0, 1.0), nb_iter=5, loss_fn=None)[source]

Abstract base class for all iterative gradient based attacks.

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of adversarial perturbation generated by the attack to data range. Default: 0.3.

  • eps_iter (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.1.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • nb_iter (int) – Number of iterations. Default: 5.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

abstract generate(inputs, labels)[source]

Generate adversarial examples based on input samples and original/target labels.

Parameters
  • inputs (Union[numpy.ndarray, tuple]) – Benign input samples used as references to create adversarial examples.

  • labels (Union[numpy.ndarray, tuple]) – Original/target labels. If an input has more than one label, the labels are wrapped in a tuple.

Raises

NotImplementedError – This function is not available in IterativeGradientMethod.

class mindarmour.adv_robustness.attacks.JSMAAttack(network, num_classes, box_min=0.0, box_max=1.0, theta=1.0, max_iteration=1000, max_count=3, increase=True, sparse=True)[source]

Jacobian-based Saliency Map Attack is a targeted and iterative attack based on the saliency map of the input features. It uses the gradient of the loss for each class label with respect to every component of the input. Then a saliency map is used to select the dimension which produces the maximum error.

Reference: The limitations of deep learning in adversarial settings

Parameters
  • network (Cell) – Target model.

  • num_classes (int) – Number of labels of model output, which should be greater than zero.

  • box_min (float) – Lower bound of input of the target model. Default: 0.

  • box_max (float) – Upper bound of input of the target model. Default: 1.0.

  • theta (float) – Change ratio of one pixel (relative to input data range). Default: 1.0.

  • max_iteration (int) – Maximum round of iteration. Default: 1000.

  • max_count (int) – Maximum times to change each pixel. Default: 3.

  • increase (bool) – If True, increase perturbation. If False, decrease perturbation. Default: True.

  • sparse (bool) – If True, input labels are sparse-coded. If False, input labels are onehot-coded. Default: True.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import JSMAAttack
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> net = Net()
>>> input_shape = (1, 5)
>>> batch_size, classes = input_shape
>>> input_np = np.random.random(input_shape).astype(np.float32)
>>> label_np = np.random.randint(classes, size=batch_size)
>>> attack = JSMAAttack(net, classes, max_iteration=5)
>>> advs = attack.generate(input_np, label_np)
generate(inputs, labels)[source]

Generate adversarial examples in batch.

Parameters
Returns

numpy.ndarray, adversarial samples.
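
How a saliency map selects the input dimension to modify can be sketched from a given Jacobian. The code below follows the single-feature formulation in the referenced paper ("The limitations of deep learning in adversarial settings"). The library's exact computation may differ, and the function name saliency_map and the Jacobian values are hypothetical.

```python
import numpy as np

def saliency_map(jacobian, target, increase=True):
    """Saliency of each input feature for pushing the model toward `target`.

    jacobian[j, i] is the derivative of the class-j output w.r.t. feature i.
    """
    alpha = jacobian[target]                  # effect on the target class
    beta = jacobian.sum(axis=0) - alpha       # summed effect on all other classes
    if increase:
        # Useful features raise the target output and lower the others.
        mask = (alpha >= 0) & (beta <= 0)
        return np.where(mask, alpha * np.abs(beta), 0.0)
    # When decreasing a feature, the useful signs are reversed.
    mask = (alpha <= 0) & (beta >= 0)
    return np.where(mask, np.abs(alpha) * beta, 0.0)

# Hypothetical Jacobian for 3 classes and 4 input features.
jacobian = np.array([[ 0.5, -0.2,  0.1,  0.3],
                     [-0.4,  0.1,  0.2, -0.1],
                     [-0.3,  0.3, -0.5, -0.1]])
smap = saliency_map(jacobian, target=0)
best_feature = int(np.argmax(smap))
```

The selected feature is then changed by theta (relative to the data range), and the process repeats until the target class is reached or max_iteration is exhausted.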

class mindarmour.adv_robustness.attacks.LBFGS(network, eps=1e-05, bounds=(0.0, 1.0), is_targeted=True, nb_iter=150, search_iters=30, loss_fn=None, sparse=False)[source]

In the L-BFGS-B attack, the Limited-Memory BFGS optimization algorithm is used to minimize the distance between the inputs and the adversarial examples.

References: Pedro Tabacof, Eduardo Valle. “Exploring the Space of Adversarial Images”

Parameters
  • network (Cell) – The network of attacked model.

  • eps (float) – Attack step size. Default: 1e-5.

  • bounds (tuple) – Upper and lower bounds of data. Default: (0.0, 1.0)

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: True.

  • nb_iter (int) – Number of iterations of the lbfgs-optimizer, which should be greater than zero. Default: 150.

  • search_iters (int) – Number of changes in step size, which should be greater than zero. Default: 30.

  • loss_fn (Functions) – Loss function of substitute model. Default: None.

  • sparse (bool) – If True, input labels are sparse-coded. If False, input labels are onehot-coded. Default: False.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import LBFGS
>>> import mindspore.ops.operations as P
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...         self._reduce = P.ReduceSum()
...         self._squeeze = P.Squeeze(1)
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         out = self._reduce(out, 2)
...         out = self._squeeze(out)
...         return out
>>> net = Net()
>>> classes = 10
>>> attack = LBFGS(net, is_targeted=True)
>>> input_np = np.asarray(np.random.random((1,1,32,32)), np.float32)
>>> label_np = np.array([3]).astype(np.int64)
>>> target_np = np.array([7]).astype(np.int64)
>>> target_np = np.eye(10)[target_np].astype(np.float32)
>>> adv = attack.generate(input_np, target_np)
generate(inputs, labels)[source]

Generate adversarial examples based on input data and target labels.

Parameters
  • inputs (numpy.ndarray) – Benign input samples used as references to create adversarial examples.

  • labels (numpy.ndarray) – Original/target labels.

Returns

numpy.ndarray, generated adversarial examples.

class mindarmour.adv_robustness.attacks.LeastLikelyClassMethod(network, eps=0.07, alpha=None, bounds=(0.0, 1.0), loss_fn=None)[source]

The Single Step Least-Likely Class Method, a variant of FGSM, targets the least-likely class to generate the adversarial examples.

References: F. Tramer, et al., “Ensemble adversarial training: Attacks and defenses,” in ICLR, 2018

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.07.

  • alpha (float) – Proportion of single-step random perturbation to data range. Default: None.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import LeastLikelyClassMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> net = Net()
>>> attack = LeastLikelyClassMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> adv_x = attack.generate(inputs, labels)
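
Targeting the least-likely class amounts to a single signed-gradient step that decreases the loss for that class. The following NumPy sketch illustrates this under assumptions: a hypothetical linear softmax model (W, b), softmax cross-entropy as the loss, and eps scaled by the data range. The function name least_likely_step is illustrative and is not the library's code.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def least_likely_step(x, W, b, eps=0.07, bounds=(0.0, 1.0)):
    clip_min, clip_max = bounds
    probs = softmax(x @ W + b)
    ll = probs.argmin(axis=1)                 # least-likely class per sample
    y_ll = np.eye(W.shape[1])[ll]
    # Gradient of the cross-entropy toward the least-likely class w.r.t. x.
    grad = (probs - y_ll) @ W.T
    # Targeted step: descend the loss, hence the minus sign.
    x_adv = x - eps * (clip_max - clip_min) * np.sign(grad)
    return np.clip(x_adv, clip_min, clip_max), ll

W = np.eye(3)                                 # hypothetical weights
b = np.zeros(3)
x = np.array([[0.1, 0.2, 0.7]])
x_adv, ll = least_likely_step(x, W, b)
prob_before = softmax(x @ W + b)[0, ll[0]]
prob_after = softmax(x_adv @ W + b)[0, ll[0]]
```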
class mindarmour.adv_robustness.attacks.MomentumDiverseInputIterativeMethod(network, eps=0.3, bounds=(0.0, 1.0), is_targeted=False, norm_level='l1', prob=0.5, loss_fn=None)[source]

The Momentum Diverse Input Iterative Method attack is a momentum iterative method, and applies random transformation to the input data at each iteration. Such transformation on the input data could improve the transferability of the adversarial examples.

References: Xie, Cihang and Zhang, et al., “Improving Transferability of Adversarial Examples With Input Diversity,” in CVPR, 2019

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of adversarial perturbation generated by the attack to data range. Default: 0.3.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • norm_level (Union[int, numpy.inf]) – Order of the norm. Possible values: np.inf, 1 or 2. Default: ‘l1’.

  • prob (float) – Transformation probability. Default: 0.5.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore.ops import operations as P
>>> from mindarmour.adv_robustness.attacks import MomentumDiverseInputIterativeMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> attack = MomentumDiverseInputIterativeMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> adv_x = attack.generate(inputs, labels)
class mindarmour.adv_robustness.attacks.MomentumIterativeMethod(network, eps=0.3, eps_iter=0.1, bounds=(0.0, 1.0), is_targeted=False, nb_iter=5, decay_factor=1.0, norm_level='inf', loss_fn=None)[source]

The Momentum Iterative Method attack accelerates gradient-based attacks, such as FGSM, FGM, and LLCM, by accumulating a velocity vector in the gradient direction of the loss function across iterations, and thus generates the adversarial examples.

References: Y. Dong, et al., “Boosting adversarial attacks with momentum,” arXiv:1710.06081, 2017

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of adversarial perturbation generated by the attack to data range. Default: 0.3.

  • eps_iter (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.1.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • nb_iter (int) – Number of iterations. Default: 5.

  • decay_factor (float) – Decay factor in iterations. Default: 1.0.

  • norm_level (Union[int, numpy.inf]) – Order of the norm. Possible values: np.inf, 1 or 2. Default: ‘inf’.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore.ops import operations as P
>>> from mindarmour.adv_robustness.attacks import MomentumIterativeMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> attack = MomentumIterativeMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> adv_x = attack.generate(inputs, labels)
generate(inputs, labels)[source]

Generate adversarial examples based on input data and origin/target labels.

Parameters
  • inputs (Union[numpy.ndarray, tuple]) – Benign input samples used as references to create adversarial examples.

  • labels (Union[numpy.ndarray, tuple]) – Original/target labels. For each input if it has more than one label, it is wrapped in a tuple.

Returns

numpy.ndarray, generated adversarial examples.
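
The momentum accumulation described for MomentumIterativeMethod can be sketched in plain Python. This is a minimal illustration that follows the update rule in Dong et al., not MindArmour's implementation: the names momentum_iterative_sketch and grad_fn are hypothetical, only the infinity norm is handled, and grad_fn stands in for the gradient of the loss with respect to the input.

```python
def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def momentum_iterative_sketch(x_orig, grad_fn, eps=0.3, eps_iter=0.1,
                              bounds=(0.0, 1.0), nb_iter=5, decay_factor=1.0):
    """Untargeted momentum iterative attack on a flat list of floats.

    grad_fn is a hypothetical callable returning the loss gradient at a sample.
    """
    clip_min, clip_max = bounds
    data_range = clip_max - clip_min  # eps and eps_iter are proportions of this range
    x_adv = list(x_orig)
    velocity = [0.0] * len(x_adv)
    for _ in range(nb_iter):
        grad = grad_fn(x_adv)
        l1_norm = sum(abs(g) for g in grad) or 1.0  # guard against a zero gradient
        # Accumulate the L1-normalized gradient into the velocity vector.
        velocity = [decay_factor * v + g / l1_norm for v, g in zip(velocity, grad)]
        for i, v in enumerate(velocity):
            stepped = x_adv[i] + eps_iter * data_range * sign(v)
            # Stay within eps of the original sample and inside the data range.
            lower = max(x_orig[i] - eps * data_range, clip_min)
            upper = min(x_orig[i] + eps * data_range, clip_max)
            x_adv[i] = min(max(stepped, lower), upper)
    return x_adv


# Toy usage: a constant gradient stands in for a real model's loss gradient.
adv = momentum_iterative_sketch([0.1, 0.2, 0.7], lambda x: [1.0, -2.0, 0.5])
```

With decay_factor=0.0 the velocity reduces to the current normalized gradient, so the loop behaves like a plain iterative sign-gradient attack.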

class mindarmour.adv_robustness.attacks.NES(model, scene, max_queries=10000, top_k=- 1, num_class=10, batch_size=128, epsilon=0.3, samples_per_draw=128, momentum=0.9, learning_rate=0.001, max_lr=0.05, min_lr=0.0005, sigma=0.001, plateau_length=20, plateau_drop=2.0, adv_thresh=0.25, zero_iters=10, starting_eps=1.0, starting_delta_eps=0.5, label_only_sigma=0.001, conservative=2, sparse=True)[source]

The class is an implementation of the Natural Evolutionary Strategies Attack Method. NES uses natural evolutionary strategies to estimate gradients and thereby improve query efficiency. NES covers three settings: Query-Limited, Partial-Information, and Label-Only. In the Query-Limited setting, the attack has a limited number of queries to the target model but has access to the probabilities of all classes. In the Partial-Information setting, the attack only has access to the probabilities of the top-k classes. In the Label-Only setting, the attack only has access to a list of k inferred labels ordered by their predicted probabilities. In the Partial-Information and Label-Only settings, NES performs a targeted attack, so the user needs to call the set_target_images method to set target images of the target classes.

References: Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. Black-box adversarial attacks with limited queries and information. In ICML, July 2018

Parameters
  • model (BlackModel) – Target model to be attacked.

  • scene (str) – Scene in ‘Label_Only’, ‘Partial_Info’ or ‘Query_Limit’.

  • max_queries (int) – Maximum query numbers to generate an adversarial example. Default: 10000.

  • top_k (int) – For Partial-Info or Label-Only setting, indicating how much (Top-k) information is available for the attacker. For Query-Limited setting, this input should be set as -1. Default: -1.

  • num_class (int) – Number of classes in dataset. Default: 10.

  • batch_size (int) – Batch size. Default: 128.

  • epsilon (float) – Maximum perturbation allowed in attack. Default: 0.3.

  • samples_per_draw (int) – Number of samples drawn in antithetic sampling. Default: 128.

  • momentum (float) – Momentum. Default: 0.9.

  • learning_rate (float) – Learning rate. Default: 1e-3.

  • max_lr (float) – Max Learning rate. Default: 5e-2.

  • min_lr (float) – Min Learning rate. Default: 5e-4.

  • sigma (float) – Step size of random noise. Default: 1e-3.

  • plateau_length (int) – Length of plateau used in Annealing algorithm. Default: 20.

  • plateau_drop (float) – Drop of plateau used in Annealing algorithm. Default: 2.0.

  • adv_thresh (float) – Threshold of adversarial. Default: 0.25.

  • zero_iters (int) – Number of points to use for the proxy score. Default: 10.

  • starting_eps (float) – Starting epsilon used in Label-Only setting. Default: 1.0.

  • starting_delta_eps (float) – Delta epsilon used in Label-Only setting. Default: 0.5.

  • label_only_sigma (float) – Sigma used in Label-Only setting. Default: 1e-3.

  • conservative (int) – Conservation used in epsilon decay; it is increased if the attack does not converge. Default: 2.

  • sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: True.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> from mindarmour import BlackModel
>>> import mindspore.ops.operations as P
>>> from mindarmour.adv_robustness.attacks import NES
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...         self._reduce = P.ReduceSum()
...         self._squeeze = P.Squeeze(1)
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         out = self._reduce(out, 2)
...         out = self._squeeze(out)
...         return out
>>> class ModelToBeAttacked(BlackModel):
...     def __init__(self, network):
...         super(ModelToBeAttacked, self).__init__()
...         self._network = network
...     def predict(self, inputs):
...         if len(inputs.shape) == 1:
...             inputs = np.expand_dims(inputs, axis=0)
...         result = self._network(Tensor(inputs.astype(np.float32)))
...         return result.asnumpy()
>>> net = Net()
>>> model = ModelToBeAttacked(net)
>>> SCENE = 'Query_Limit'
>>> TOP_K = -1
>>> attack = NES(model, SCENE, top_k=TOP_K)
>>> num_class = 5
>>> x_test = np.asarray(np.random.random((1, 1, 32, 32)), np.float32)
>>> target_image = np.asarray(np.random.random((1, 1, 32, 32)), np.float32)
>>> orig_class = 0
>>> target_class = 2
>>> attack.set_target_images(target_image)
>>> tag, adv, queries = attack.generate(np.array(x_test), np.array([target_class]))
generate(inputs, labels)[source]

Generate adversarial examples based on input data and target labels.

Parameters
Returns

  • numpy.ndarray, bool values for each attack result.

  • numpy.ndarray, generated adversarial examples.

  • numpy.ndarray, query times for each sample.

Raises
  • ValueError – If top_k is less than 0 in the Label-Only or Partial-Info setting.

  • ValueError – If target_imgs is None in the Label-Only or Partial-Info setting.

  • ValueError – If scene is not in [‘Label_Only’, ‘Partial_Info’, ‘Query_Limit’].

set_target_images(target_images)[source]

Set target samples for the targeted attack in the Partial-Info or Label-Only setting.

Parameters

target_images (numpy.ndarray) – Target samples for the targeted attack.
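
The gradient estimation that NES relies on can be illustrated with antithetic sampling in plain Python. This is a minimal sketch of the estimator from Ilyas et al., not MindArmour's code: nes_gradient_sketch and score_fn are hypothetical names, and score_fn stands in for a black-box query that returns a scalar score (for example, the probability of the target class).

```python
import random


def nes_gradient_sketch(score_fn, x, sigma=0.001, samples_per_draw=128, seed=0):
    """Estimate the gradient of score_fn at x using only function queries.

    Each Gaussian direction u is evaluated at x + sigma*u and x - sigma*u
    (antithetic sampling), so samples_per_draw queries use half as many directions.
    """
    rng = random.Random(seed)
    n_pairs = samples_per_draw // 2
    grad = [0.0] * len(x)
    for _ in range(n_pairs):
        u = [rng.gauss(0.0, 1.0) for _ in x]
        plus = score_fn([xi + sigma * ui for xi, ui in zip(x, u)])
        minus = score_fn([xi - sigma * ui for xi, ui in zip(x, u)])
        for j, uj in enumerate(u):
            grad[j] += (plus - minus) * uj
    # Finite-difference scaling: 2 * sigma per pair, averaged over all pairs.
    return [g / (2.0 * sigma * n_pairs) for g in grad]


# Toy usage: the true gradient of this linear score is [2.0, -3.0].
estimate = nes_gradient_sketch(lambda v: 2.0 * v[0] - 3.0 * v[1],
                               [0.5, 0.5], samples_per_draw=10000)
```

The estimate is noisy for small samples_per_draw; more samples reduce the variance at the cost of more queries, which is the trade-off that max_queries bounds.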

class mindarmour.adv_robustness.attacks.PSOAttack(model, model_type='classification', targeted=False, reserve_ratio=0.3, sparse=True, step_size=0.5, per_bounds=0.6, c1=2.0, c2=2.0, c=2.0, pop_size=6, t_max=1000, pm=0.5, bounds=None)[source]

The PSO Attack is a black-box attack based on the Particle Swarm Optimization algorithm, which belongs to the family of evolutionary algorithms. This attack was proposed by Rayan Mosli et al. (2019).

References: Rayan Mosli, Matthew Wright, Bo Yuan, Yin Pan, “They Might NOT Be Giants: Crafting Black-Box Adversarial Examples with Fewer Queries Using Particle Swarm Optimization”, arXiv:1909.07490, 2019.

Parameters
  • model (BlackModel) – Target model.

  • step_size (Union[int, float]) – Attack step size. Default: 0.5.

  • per_bounds (Union[int, float]) – Relative variation range of perturbations. Default: 0.6.

  • c1 (Union[int, float]) – Weight coefficient. Default: 2.

  • c2 (Union[int, float]) – Weight coefficient. Default: 2.

  • c (Union[int, float]) – Weight of perturbation loss. Default: 2.

  • pop_size (int) – The number of particles, which should be greater than zero. Default: 6.

  • t_max (int) – The maximum round of iteration for each adversarial example, which should be greater than zero. Default: 1000.

  • pm (Union[int, float]) – The probability of mutations, which should be in the range of (0, 1). Default: 0.5.

  • bounds (Union[list, tuple, None]) – Upper and lower bounds of data. In form of (clip_min, clip_max). Default: None.

  • targeted (bool) – If True, turns on the targeted attack. If False, turns on the untargeted attack. Note that only the untargeted attack is supported for model_type=’detection’. Default: False.

  • sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: True.

  • model_type (str) – The type of the target model. ‘classification’ and ‘detection’ are currently supported. Default: ‘classification’.

  • reserve_ratio (Union[int, float]) – The percentage of objects that can be detected after attacks, specifically for model_type=’detection’. Reserve_ratio should be in the range of (0, 1). Default: 0.3.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> from mindspore.nn import Cell
>>> from mindarmour import BlackModel
>>> from mindarmour.adv_robustness.attacks import PSOAttack
>>> class ModelToBeAttacked(BlackModel):
...     def __init__(self, network):
...         super(ModelToBeAttacked, self).__init__()
...         self._network = network
...     def predict(self, inputs):
...         if len(inputs.shape) == 1:
...             inputs = np.expand_dims(inputs, axis=0)
...         result = self._network(Tensor(inputs.astype(np.float32)))
...         return result.asnumpy()
>>> class Net(Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> net = Net()
>>> model = ModelToBeAttacked(net)
>>> attack = PSOAttack(model, bounds=(0.0, 1.0), pm=0.5, sparse=False)
>>> batch_size = 6
>>> x_test = np.random.rand(batch_size, 10)
>>> y_test = np.random.randint(low=0, high=10, size=batch_size)
>>> y_test = np.eye(10)[y_test]
>>> y_test = y_test.astype(np.float32)
>>> _, adv_data, _ = attack.generate(x_test, y_test)
generate(inputs, labels)[source]

Generate adversarial examples based on input data and targeted labels (or ground_truth labels).

Parameters
  • inputs (Union[numpy.ndarray, tuple]) – Input samples. The format of inputs should be numpy.ndarray if model_type=’classification’. The format of inputs can be (input1, input2, …) or only one array if model_type=’detection’.

  • labels (Union[numpy.ndarray, tuple]) – Targeted labels or ground-truth labels. The format of labels should be numpy.ndarray if model_type=’classification’. The format of labels should be (gt_boxes, gt_labels) if model_type=’detection’.

Returns

  • numpy.ndarray, bool values for each attack result.

  • numpy.ndarray, generated adversarial examples.

  • numpy.ndarray, query times for each sample.
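
For readers unfamiliar with Particle Swarm Optimization, the core velocity and position update can be sketched as below. This is a generic textbook PSO minimizing an arbitrary fitness function, not PSOAttack itself: pso_minimize_sketch, fitness, and inertia are hypothetical names, the coefficients 0.7 and 1.5 are illustrative values rather than the class defaults, and the mutation step (pm) and perturbation-loss weight (c) are not modeled.

```python
import random


def pso_minimize_sketch(fitness, dim, pop_size=6, t_max=1000,
                        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness over dim-dimensional points with a basic particle swarm."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    velocities = [[0.0] * dim for _ in range(pop_size)]
    personal_best = [list(p) for p in positions]
    personal_score = [fitness(p) for p in positions]
    best_index = min(range(pop_size), key=lambda i: personal_score[i])
    global_best = list(personal_best[best_index])
    global_score = personal_score[best_index]
    for _ in range(t_max):
        for i in range(pop_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle toward its own best and the swarm's best point.
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            score = fitness(positions[i])
            if score < personal_score[i]:
                personal_best[i], personal_score[i] = list(positions[i]), score
            if score < global_score:
                global_best, global_score = list(positions[i]), score
    return global_best, global_score


# Toy usage: minimize the squared distance to the origin in two dimensions.
best, best_score = pso_minimize_sketch(lambda p: sum(v * v for v in p),
                                       dim=2, pop_size=20, t_max=200)
```

In an attack setting the fitness would be computed from model queries on perturbed inputs, which is why pop_size and t_max directly determine the query budget.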

class mindarmour.adv_robustness.attacks.PointWiseAttack(model, max_iter=1000, search_iter=10, is_targeted=False, init_attack=None, sparse=True)[source]

The Pointwise Attack uses the minimum number of changed pixels to generate an adversarial sample for each original sample. A binary search is then applied to the changed pixels so that the distance between the adversarial sample and the original sample is as small as possible.

References: L. Schott, J. Rauber, M. Bethge, W. Brendel: “Towards the first adversarially robust neural network model on MNIST”, ICLR (2019)

Parameters
  • model (BlackModel) – Target model.

  • max_iter (int) – Max rounds of iteration to generate adversarial image. Default: 1000.

  • search_iter (int) – Max rounds of binary search. Default: 10.

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • init_attack (Attack) – Attack used to find a starting point. Default: None.

  • sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: True.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> from mindarmour import BlackModel
>>> import mindspore.ops.operations as P
>>> from mindarmour.adv_robustness.attacks import PointWiseAttack
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...         self._reduce = P.ReduceSum()
...         self._squeeze = P.Squeeze(1)
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         out = self._reduce(out, 2)
...         out = self._squeeze(out)
...         return out
>>> class ModelToBeAttacked(BlackModel):
...     def __init__(self, network):
...         super(ModelToBeAttacked, self).__init__()
...         self._network = network
...     def predict(self, inputs):
...         result = self._network(Tensor(inputs.astype(np.float32)))
...         return result.asnumpy()
>>> net = Net()
>>> np.random.seed(5)
>>> model = ModelToBeAttacked(net)
>>> attack = PointWiseAttack(model)
>>> x_test = np.asarray(np.random.random((1,1,32,32)), np.float32)
>>> y_test = np.random.randint(0, 3, size=1)
>>> is_adv_list, adv_list, query_times_each_adv = attack.generate(x_test, y_test)
generate(inputs, labels)[source]

Generate adversarial examples based on input samples and targeted labels.

Parameters
  • inputs (numpy.ndarray) – Benign input samples used as references to create adversarial examples.

  • labels (numpy.ndarray) – For targeted attack, labels are adversarial target labels. For untargeted attack, labels are ground-truth labels.

Returns

  • numpy.ndarray, bool values for each attack result.

  • numpy.ndarray, generated adversarial examples.

  • numpy.ndarray, query times for each sample.
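
The two phases described for the Pointwise Attack (resetting changed pixels, then binary search) can be sketched on a flat list of pixel values. This is an illustration under stated assumptions, not the library's implementation: pointwise_sketch and is_adversarial are hypothetical names, is_adversarial stands in for a model query, and start_adv is assumed to be an already adversarial starting point such as the output of init_attack.

```python
import random


def pointwise_sketch(original, start_adv, is_adversarial,
                     max_iter=1000, search_iter=10, seed=0):
    """Shrink an adversarial sample toward the original, pixel by pixel."""
    rng = random.Random(seed)
    adv = list(start_adv)
    # Phase 1: reset changed pixels to their original value while the
    # sample stays adversarial; stop once a full pass makes no progress.
    for _ in range(max_iter):
        changed = [i for i in range(len(adv)) if adv[i] != original[i]]
        rng.shuffle(changed)
        progress = False
        for i in changed:
            backup = adv[i]
            adv[i] = original[i]
            if is_adversarial(adv):
                progress = True
            else:
                adv[i] = backup  # this pixel is needed, restore it
        if not progress:
            break
    # Phase 2: binary search each remaining pixel toward its original value.
    for i in range(len(adv)):
        if adv[i] == original[i]:
            continue
        low, high = original[i], adv[i]  # low is not adversarial, high is
        for _ in range(search_iter):
            mid = (low + high) / 2.0
            adv[i] = mid
            if is_adversarial(adv):
                high = mid
            else:
                low = mid
        adv[i] = high  # closest value known to keep the sample adversarial
    return adv


# Toy usage: the "model" is fooled whenever the pixel sum reaches 1.5.
fooled = lambda pixels: sum(pixels) >= 1.5
result = pointwise_sketch([0.0] * 4, [1.0] * 4, fooled)
```

Here two of the four pixels are reset entirely, and one of the remaining two is halved by the binary search.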

class mindarmour.adv_robustness.attacks.ProjectedGradientDescent(network, eps=0.3, eps_iter=0.1, bounds=(0.0, 1.0), is_targeted=False, nb_iter=5, norm_level='inf', loss_fn=None)[source]

The Projected Gradient Descent attack is a variant of the Basic Iterative Method in which, after each iteration, the perturbation is projected on an lp-ball of specified radius (in addition to clipping the values of the adversarial sample so that it lies in the permitted data range). This is the attack proposed by Madry et al. for adversarial training.

References: A. Madry, et al., “Towards deep learning models resistant to adversarial attacks,” in ICLR, 2018

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of adversarial perturbation generated by the attack to data range. Default: 0.3.

  • eps_iter (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.1.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • nb_iter (int) – Number of iterations. Default: 5.

  • norm_level (Union[int, numpy.inf]) – Order of the norm. Possible values: np.inf, 1 or 2. Default: ‘inf’.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore.ops import operations as P
>>> from mindarmour.adv_robustness.attacks import ProjectedGradientDescent
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         return out
>>> net = Net()
>>> attack = ProjectedGradientDescent(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> adv_x = attack.generate(inputs, labels)
generate(inputs, labels)[source]

Iteratively generate adversarial examples based on the BIM method. The perturbation is normalized by the projection method with the parameter norm_level.

Parameters
  • inputs (Union[numpy.ndarray, tuple]) – Benign input samples used as references to create adversarial examples.

  • labels (Union[numpy.ndarray, tuple]) – Original/target labels. For each input if it has more than one label, it is wrapped in a tuple.

Returns

numpy.ndarray, generated adversarial examples.
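
The projection step that distinguishes Projected Gradient Descent from the Basic Iterative Method can be sketched as follows. This is a minimal illustration, not MindArmour's implementation: project_perturbation and pgd_step_sketch are hypothetical helpers, math.inf is used where the class accepts ‘inf’ or np.inf, only norm levels inf and 2 are covered, and a data range of 1.0 is assumed so that eps is also an absolute radius.

```python
import math


def project_perturbation(delta, eps, norm_level=math.inf):
    """Project a perturbation onto the lp-ball of radius eps."""
    if norm_level == math.inf:
        # Infinity norm: clip every component independently.
        return [max(-eps, min(eps, d)) for d in delta]
    if norm_level == 2:
        # L2 norm: rescale the whole vector if it is too long.
        l2_norm = math.sqrt(sum(d * d for d in delta))
        if l2_norm <= eps:
            return list(delta)
        return [d * eps / l2_norm for d in delta]
    raise ValueError("this sketch only handles norm_level of math.inf or 2")


def pgd_step_sketch(x_adv, x_orig, grad, eps, eps_iter,
                    bounds=(0.0, 1.0), norm_level=math.inf):
    """One untargeted step: move, project onto the ball, clip to the data range."""
    clip_min, clip_max = bounds
    # Sign step, as commonly used with the infinity norm.
    stepped = [xa + eps_iter * ((g > 0) - (g < 0)) for xa, g in zip(x_adv, grad)]
    delta = project_perturbation([s - xo for s, xo in zip(stepped, x_orig)],
                                 eps, norm_level)
    return [min(max(xo + d, clip_min), clip_max) for xo, d in zip(x_orig, delta)]


# Toy usage with a hand-written gradient.
next_adv = pgd_step_sketch([0.9, 0.1], [0.8, 0.1], [1.0, -1.0], eps=0.15, eps_iter=0.1)
```

Projecting the perturbation (rather than the sample) keeps the result within eps of the original input, while the final clip keeps it inside bounds.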

class mindarmour.adv_robustness.attacks.RandomFastGradientMethod(network, eps=0.07, alpha=0.035, bounds=(0.0, 1.0), norm_level=2, is_targeted=False, loss_fn=None)[source]

Fast Gradient Method using random perturbation. A one-step attack based on gradient calculation. The adversarial noises are generated based on the gradients of inputs, and then randomly perturbed.

References: Florian Tramer, Alexey Kurakin, Nicolas Papernot, “Ensemble adversarial training: Attacks and defenses” in ICLR, 2018

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.07.

  • alpha (float) – Proportion of single-step random perturbation to data range. Default: 0.035.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • norm_level (Union[int, numpy.inf]) – Order of the norm. Possible values: np.inf, 1 or 2. Default: 2.

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Raises

ValueError – If eps is smaller than alpha.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import RandomFastGradientMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> net = Net()
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> attack = RandomFastGradientMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> adv_x = attack.generate(inputs, labels)
class mindarmour.adv_robustness.attacks.RandomFastGradientSignMethod(network, eps=0.07, alpha=0.035, bounds=(0.0, 1.0), is_targeted=False, loss_fn=None)[source]

Fast Gradient Sign Method using random perturbation. The Random Fast Gradient Sign Method attack calculates the gradient of the input data, and then uses the sign of the gradient with random perturbation to create adversarial noises.

References: F. Tramer, et al., “Ensemble adversarial training: Attacks and defenses,” in ICLR, 2018

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.07.

  • alpha (float) – Proportion of single-step random perturbation to data range. Default: 0.035.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Raises

ValueError – If eps is smaller than alpha.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import RandomFastGradientSignMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> net = Net()
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> attack = RandomFastGradientSignMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> adv_x = attack.generate(inputs, labels)
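
The relationship between eps and alpha in the random single-step attacks can be illustrated with a plain-Python sketch of the random-start sign-gradient step from Tramer et al. It is not the library's implementation: random_fgsm_sketch and grad_fn are hypothetical names, and grad_fn stands in for the loss gradient with respect to the input. The random step consumes alpha of the budget and the gradient step uses the remaining eps - alpha, which is why eps must not be smaller than alpha.

```python
import random


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def random_fgsm_sketch(x, grad_fn, eps=0.07, alpha=0.035,
                       bounds=(0.0, 1.0), seed=0):
    """Untargeted single-step sign-gradient attack with a random start."""
    if eps < alpha:
        raise ValueError("eps must not be smaller than alpha")
    rng = random.Random(seed)
    clip_min, clip_max = bounds
    data_range = clip_max - clip_min  # eps and alpha are proportions of this range

    def clip(value):
        return min(max(value, clip_min), clip_max)

    # Step 1: random perturbation of size alpha in a random sign direction.
    x_rand = [clip(xi + alpha * data_range * sign(rng.gauss(0.0, 1.0))) for xi in x]
    # Step 2: sign-gradient step with the remaining budget, evaluated at x_rand.
    grad = grad_fn(x_rand)
    return [clip(xr + (eps - alpha) * data_range * sign(g))
            for xr, g in zip(x_rand, grad)]


# Toy usage: a constant gradient stands in for a real model's loss gradient.
adv = random_fgsm_sketch([0.5, 0.5, 0.5], lambda x: [1.0, -1.0, 1.0])
```

Because the two steps add up to at most eps per component, the result never moves further than eps (times the data range) from the original sample.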
class mindarmour.adv_robustness.attacks.RandomLeastLikelyClassMethod(network, eps=0.07, alpha=0.035, bounds=(0.0, 1.0), loss_fn=None)[source]

Least-Likely Class Method using random perturbation.

The Single Step Least-Likely Class Method with Random Perturbation, a variant of Random FGSM, targets the least-likely class to generate the adversarial examples.

References: F. Tramer, et al., “Ensemble adversarial training: Attacks and defenses,” in ICLR, 2018

Parameters
  • network (Cell) – Target model.

  • eps (float) – Proportion of single-step adversarial perturbation generated by the attack to data range. Default: 0.07.

  • alpha (float) – Proportion of single-step random perturbation to data range. Default: 0.035.

  • bounds (tuple) – Upper and lower bounds of data, indicating the data range. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • loss_fn (Loss) – Loss function for optimization. If None, the input network is already equipped with loss function. Default: None.

Raises

ValueError – If eps is smaller than alpha.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindarmour.adv_robustness.attacks import RandomLeastLikelyClassMethod
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._relu = nn.ReLU()
...     def construct(self, inputs):
...         out = self._relu(inputs)
...         return out
>>> inputs = np.asarray([[0.1, 0.2, 0.7]], np.float32)
>>> labels = np.asarray([2],np.int32)
>>> labels = np.eye(3)[labels].astype(np.float32)
>>> net = Net()
>>> attack = RandomLeastLikelyClassMethod(net, loss_fn=nn.SoftmaxCrossEntropyWithLogits(sparse=False))
>>> adv_x = attack.generate(inputs, labels)
class mindarmour.adv_robustness.attacks.SaltAndPepperNoiseAttack(model, bounds=(0.0, 1.0), max_iter=100, is_targeted=False, sparse=True)[source]

Increases the amount of salt and pepper noise to generate adversarial samples.

Parameters
  • model (BlackModel) – Target model.

  • bounds (tuple) – Upper and lower bounds of data. In form of (clip_min, clip_max). Default: (0.0, 1.0).

  • max_iter (int) – Max iteration to generate an adversarial example. Default: 100.

  • is_targeted (bool) – If True, targeted attack. If False, untargeted attack. Default: False.

  • sparse (bool) – If True, input labels are sparse-encoded. If False, input labels are one-hot-encoded. Default: True.

Examples

>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor
>>> from mindarmour import BlackModel
>>> import mindspore.ops.operations as P
>>> from mindarmour.adv_robustness.attacks import SaltAndPepperNoiseAttack
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self._softmax = P.Softmax()
...         self._reduce = P.ReduceSum()
...         self._squeeze = P.Squeeze(1)
...     def construct(self, inputs):
...         out = self._softmax(inputs)
...         out = self._reduce(out, 2)
...         out = self._squeeze(out)
...         return out
>>> class ModelToBeAttacked(BlackModel):
...     def __init__(self, network):
...         super(ModelToBeAttacked, self).__init__()
...         self._network = network
...     def predict(self, inputs):
...         if len(inputs.shape) == 1:
...             inputs = np.expand_dims(inputs, axis=0)
...         result = self._network(Tensor(inputs.astype(np.float32)))
...         return result.asnumpy()
>>> net = Net()
>>> model = ModelToBeAttacked(net)
>>> attack = SaltAndPepperNoiseAttack(model)
>>> x_test = np.asarray(np.random.random((1,1,32,32)), np.float32)
>>> y_test = np.random.randint(0, 3, size=1)
>>> _, adv_list, _ = attack.generate(x_test, y_test)
generate(inputs, labels)[source]

Generate adversarial examples based on input data and target labels.

Parameters
Returns

  • numpy.ndarray, bool values for each attack result.

  • numpy.ndarray, generated adversarial examples.

  • numpy.ndarray, query times for each sample.
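
The idea of increasing the amount of salt and pepper noise until the model is fooled can be sketched in plain Python. This is a simplified illustration rather than the library's search procedure: salt_and_pepper_sketch and is_adversarial are hypothetical names, the noise ratio grows linearly with the query count, and is_adversarial stands in for a query to the target model.

```python
import random


def salt_and_pepper_sketch(x, is_adversarial, bounds=(0.0, 1.0),
                           max_iter=100, seed=0):
    """Return (success, sample, queries) after adding increasing noise."""
    rng = random.Random(seed)
    clip_min, clip_max = bounds
    for query in range(1, max_iter + 1):
        noise_ratio = query / max_iter  # grows linearly up to 1.0
        noisy = []
        for value in x:
            if rng.random() < noise_ratio:
                # Salt (clip_max) or pepper (clip_min) with equal probability.
                noisy.append(clip_max if rng.random() < 0.5 else clip_min)
            else:
                noisy.append(value)
        if is_adversarial(noisy):
            return True, noisy, query
    return False, list(x), max_iter


original = [0.5] * 50
# Toy stand-in for a model query: "adversarial" once at least 10 pixels changed.
changed_enough = lambda v: sum(a != b for a, b in zip(v, original)) >= 10
success, adv, queries = salt_and_pepper_sketch(original, changed_enough)
```

The three returned values mirror the three outputs listed for generate: a success flag, the adversarial sample, and the number of queries used.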