mindspore.ops.ApplyMomentum

class mindspore.ops.ApplyMomentum(use_nesterov=False, use_locking=False, gradient_scale=1.0)[source]

Optimizer that implements the Momentum algorithm.

Refer to the paper On the importance of initialization and momentum in deep learning for more details.

Inputs of variable, accumulation and gradient comply with the implicit type conversion rules to make the data types consistent. If they have different data types, the lower priority data type will be converted to the relatively highest priority data type.

Refer to mindspore.nn.Momentum for more details about the formula and usage.

Parameters
  • use_locking (bool) – Whether to enable a lock to protect the variable and accumulation tensors from being updated. Default: False.

  • use_nesterov (bool) – Enable Nesterov momentum. Default: False.

  • gradient_scale (float) – The scale of the gradient. Default: 1.0.

Inputs:
  • variable (Parameter) - Weights to be updated. Data type must be float.

  • accumulation (Parameter) - Accumulated gradient value by moment weight, has the same data type with variable.

  • learning_rate (Union[Number, Tensor]) - The learning rate value, must be a float number or a scalar tensor with float data type.

  • gradient (Tensor) - Gradient, has the same data type as variable.

  • momentum (Union[Number, Tensor]) - Momentum, must be a float number or a scalar tensor with float data type.

Outputs:

Tensor, parameters to be updated.

Raises
  • TypeError – If the use_locking or use_nesterov is not a bool or gradient_scale is not a float.

  • RuntimeError – If the data type of var, accum and grad conversion of Parameter is not supported.

Supported Platforms:

Ascend GPU CPU

Examples

>>> class Net(nn.Cell):
...    def __init__(self):
...        super(Net, self).__init__()
...        self.apply_momentum = ops.ApplyMomentum()
...        self.variable = Parameter(Tensor(np.array([[0.6, 0.4],
...                                            [0.1, 0.5]]).astype(np.float32)), name="variable")
...        self.accumulate = Parameter(Tensor(np.array([[0.6, 0.5],
...                                            [0.2, 0.6]]).astype(np.float32)), name="accumulate")
...    def construct(self, lr, grad, moment):
...        out = self.apply_momentum(self.variable, self.accumulate, lr, grad, moment)
...        return out
>>> net = Net()
>>> lr = Tensor(0.1, mindspore.float32)
>>> moment = Tensor(0.9, mindspore.float32)
>>> grad = Tensor(np.array([[0.3, 0.7], [0.1, 0.8]]).astype(np.float32))
>>> output = net(lr, grad, moment)
>>> print(output)
[[0.51600003 0.285     ]
[0.072      0.366     ]]