mindspore.ops.ApplyMomentum
- class mindspore.ops.ApplyMomentum(use_nesterov=False, use_locking=False, gradient_scale=1.0)[source]
Optimizer that implements the Momentum algorithm.
Refer to the paper On the importance of initialization and momentum in deep learning for more details.
Inputs of variable, accumulation and gradient comply with the implicit type conversion rules to make the data types consistent. If they have different data types, the lower priority data type will be converted to the relatively highest priority data type.
Refer to
mindspore.nn.Momentum
for more details about the formula and usage.- Parameters
- Inputs:
variable (Parameter) - Weights to be updated. Data type must be float.
accumulation (Parameter) - Accumulated gradient value by moment weight, has the same data type with variable.
learning_rate (Union[Number, Tensor]) - The learning rate value, must be a float number or a scalar tensor with float data type.
gradient (Tensor) - Gradient, has the same data type as variable.
momentum (Union[Number, Tensor]) - Momentum, must be a float number or a scalar tensor with float data type.
- Outputs:
Tensor, parameters to be updated.
- Raises
TypeError – If the use_locking or use_nesterov is not a bool or gradient_scale is not a float.
RuntimeError – If the data type of var, accum and grad conversion of Parameter is not supported.
- Supported Platforms:
Ascend
GPU
CPU
Examples
>>> class Net(nn.Cell): ... def __init__(self): ... super(Net, self).__init__() ... self.apply_momentum = ops.ApplyMomentum() ... self.variable = Parameter(Tensor(np.array([[0.6, 0.4], ... [0.1, 0.5]]).astype(np.float32)), name="variable") ... self.accumulate = Parameter(Tensor(np.array([[0.6, 0.5], ... [0.2, 0.6]]).astype(np.float32)), name="accumulate") ... def construct(self, lr, grad, moment): ... out = self.apply_momentum(self.variable, self.accumulate, lr, grad, moment) ... return out >>> net = Net() >>> lr = Tensor(0.1, mindspore.float32) >>> moment = Tensor(0.9, mindspore.float32) >>> grad = Tensor(np.array([[0.3, 0.7], [0.1, 0.8]]).astype(np.float32)) >>> output = net(lr, grad, moment) >>> print(output) [[0.51600003 0.285 ] [0.072 0.366 ]]