mindspore.ops.grad
- mindspore.ops.grad(fn, grad_position=0, weights=None, has_aux=False) [source]
Generates a gradient function used to compute the gradient of the given function.
Differentiation covers the following three scenarios (a minimal calling sketch follows this list):
Differentiate with respect to inputs: grad_position is not None and weights is None;
Differentiate with respect to network parameters: grad_position is None and weights is not None;
Differentiate with respect to both inputs and network parameters: neither grad_position nor weights is None.
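A minimal sketch of the three calling patterns, assuming a hypothetical network net (a Cell with trainable parameters) and a single input x; these names are illustrative and not part of the API:
>>> from mindspore.ops import grad
>>> # 1. Differentiate with respect to inputs only: grad_position is set, weights is None.
>>> input_grad = grad(net, grad_position=0, weights=None)(x)
>>> # 2. Differentiate with respect to parameters only: grad_position is None, weights is set.
>>> param_grads = grad(net, grad_position=None, weights=net.trainable_params())(x)
>>> # 3. Differentiate with respect to both: the result is a tuple (input_grads, param_grads).
>>> input_grads, param_grads = grad(net, grad_position=0, weights=net.trainable_params())(x)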
- Parameters:
fn (Union[Cell, Function]) - The function or network to be differentiated.
grad_position (Union[NoneType, int, tuple[int]]) - Index or indices of the input positions to differentiate with respect to. If an int, differentiate with respect to the single input at that position; if a tuple, differentiate with respect to the inputs at the indexed positions, where indices start from 0; if None, no input is differentiated, in which case weights must not be None. Default: 0.
weights (Union[ParameterTuple, Parameter, list[Parameter]]) - The network parameters whose gradients are to be returned during training. They can usually be obtained via weights = net.trainable_params(). Default: None.
has_aux (bool) - Whether to return auxiliary outputs. If True, fn must return more than one output; only the first output of fn participates in differentiation, and the remaining outputs are returned as-is. Default: False.
- Returns:
Function, the gradient function for computing the gradient of the given function. For example, given out1, out2 = fn(*args): if has_aux is True, the gradient function returns a result of the form (gradient, out2), where out2 does not participate in differentiation; if has_aux is False, it returns gradient directly.
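A minimal sketch of the two return forms, assuming a hypothetical multi-output function fn and its arguments args; these names are illustrative only:
>>> from mindspore.ops import grad
>>> # has_aux=False (default): the gradient alone is returned.
>>> gradient = grad(fn)(*args)
>>> # has_aux=True: a (gradient, auxiliary_outputs) pair is returned; only the
>>> # first output of fn is differentiated, the rest are passed through as-is.
>>> gradient, aux = grad(fn, has_aux=True)(*args)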
- Raises:
ValueError - If grad_position and weights are both None.
TypeError - If the type of an argument does not meet the requirements.
- Supported Platforms:
Ascend
GPU
CPU
- Examples:
>>> import numpy as np
>>> import mindspore
>>> import mindspore.nn as nn
>>> from mindspore import Tensor, ops
>>> from mindspore.ops import grad
>>>
>>> # Cell object to be differentiated
>>> class Net(nn.Cell):
...     def construct(self, x, y, z):
...         return x * y * z
...
>>> x = Tensor([1, 2], mindspore.float32)
>>> y = Tensor([-2, 3], mindspore.float32)
>>> z = Tensor([0, 3], mindspore.float32)
>>> net = Net()
>>> output = grad(net, grad_position=(1, 2))(x, y, z)
>>> print(output)
(Tensor(shape=[2], dtype=Float32, value=[ 0.00000000e+00, 6.00000000e+00]), Tensor(shape=[2], dtype=Float32, value=[-2.00000000e+00, 6.00000000e+00]))
>>>
>>> # Function object to be differentiated
>>> def fn(x, y, z):
...     res = x * ops.exp(y) * ops.pow(z, 2)
...     return res, z
...
>>> x = Tensor([3, 3], mindspore.float32)
>>> y = Tensor([0, 0], mindspore.float32)
>>> z = Tensor([5, 5], mindspore.float32)
>>> gradient, aux = grad(fn, (1, 2), None, True)(x, y, z)
>>> print(gradient)
(Tensor(shape=[2], dtype=Float32, value= [ 7.50000000e+01, 7.50000000e+01]), Tensor(shape=[2], dtype=Float32, value= [ 3.00000000e+01, 3.00000000e+01]))
>>> print(aux)
(Tensor(shape=[2], dtype=Float32, value= [ 5.00000000e+00, 5.00000000e+00]),)
>>>
>>> # For given network to be differentiated with both inputs and weights, there are 3 cases.
>>> net = nn.Dense(10, 1)
>>> loss_fn = nn.MSELoss()
>>> def forward(inputs, labels):
...     logits = net(inputs)
...     loss = loss_fn(logits, labels)
...     return loss, logits
...
>>> inputs = Tensor(np.random.randn(16, 10).astype(np.float32))
>>> labels = Tensor(np.random.randn(16, 1).astype(np.float32))
>>> weights = net.trainable_params()
>>>
>>> # Case 1: gradient with respect to inputs.
>>> # Aux value does not contribute to the gradient.
>>> grad_fn = grad(forward, grad_position=(0, 1), weights=None, has_aux=True)
>>> inputs_gradient, (aux_logits,) = grad_fn(inputs, labels)
>>> print(len(inputs_gradient))
2
>>> print(aux_logits.shape)
(16, 1)
>>>
>>> # Case 2: gradient with respect to weights.
>>> grad_fn = grad(forward, grad_position=None, weights=weights, has_aux=True)
>>> params_gradient, (aux_logits,) = grad_fn(inputs, labels)
>>> print(len(weights), len(params_gradient))
2 2
>>> print(aux_logits.shape)
(16, 1)
>>>
>>> # Case 3: gradient with respect to inputs and weights.
>>> grad_fn = grad(forward, grad_position=0, weights=weights, has_aux=False)
>>> inputs_gradient, params_gradient = grad_fn(inputs, labels)
>>> print(len(weights), len(params_gradient))
2 2
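As an additional minimal sanity check, not part of the examples above: for square(x) = x ** 2 the gradient returned with the default grad_position=0 should equal 2 * x; square and x below are illustrative names only.
>>> import mindspore
>>> from mindspore import Tensor
>>> from mindspore.ops import grad
>>> def square(x):
...     return x ** 2
...
>>> x = Tensor([1.0, 2.0, 3.0], mindspore.float32)
>>> grads = grad(square)(x)  # expected to equal 2 * x, i.e. [2.0, 4.0, 6.0]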