mindspore.ops.ApplyMomentum
- class mindspore.ops.ApplyMomentum(use_nesterov=False, use_locking=False, gradient_scale=1.0)
Optimizer that implements the Momentum algorithm.
Refer to the paper "On the importance of initialization and momentum in deep learning" for more details.
The variable, accumulation and gradient inputs follow the implicit type conversion rules so that their data types are consistent. If they have different data types, the lower-priority data type is converted to the highest-priority data type among them.
Refer to mindspore.nn.Momentum for more details about the formula and usage.
- Parameters
- Inputs:
variable (Union[Parameter, Tensor]) - Weights to be updated. Data type must be float64, int64, float, float16, int16, int32, int8, uint16, uint32, uint64, uint8, complex64, complex128.
accumulation (Union[Parameter, Tensor]) - Gradient accumulated with the momentum weight. Has the same data type as variable.
learning_rate (Union[Number, Tensor]) - The learning rate value. Must be a number or a scalar tensor whose data type is one of float64, int64, float, float16, int16, int32, int8, uint16, uint32, uint64, uint8, complex64, complex128.
gradient (Tensor) - Gradient. Has the same data type as variable.
momentum (Union[Number, Tensor]) - Momentum. Must be a number or a scalar tensor whose data type is one of float64, int64, float, float16, int16, int32, int8, uint16, uint32, uint64, uint8, complex64, complex128.
- Outputs:
Tensor, parameters to be updated.
- Raises
- Supported Platforms:
Ascend
GPU
CPU
Examples
>>> import mindspore
>>> import numpy as np
>>> from mindspore import Tensor, nn, ops, Parameter
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.apply_momentum = ops.ApplyMomentum()
...         self.variable = Parameter(Tensor(np.array([[0.6, 0.4],
...                                                    [0.1, 0.5]]).astype(np.float32)), name="variable")
...         self.accumulate = Parameter(Tensor(np.array([[0.6, 0.5],
...                                                      [0.2, 0.6]]).astype(np.float32)), name="accumulate")
...     def construct(self, lr, grad, moment):
...         out = self.apply_momentum(self.variable, self.accumulate, lr, grad, moment)
...         return out
>>> net = Net()
>>> lr = Tensor(0.1, mindspore.float32)
>>> moment = Tensor(0.9, mindspore.float32)
>>> grad = Tensor(np.array([[0.3, 0.7], [0.1, 0.8]]).astype(np.float32))
>>> output = net(lr, grad, moment)
>>> print(output)
[[0.51600003 0.285     ]
 [0.072      0.366     ]]
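The printed values can be cross-checked with a small NumPy sketch of the momentum update. This is not MindSpore code: `momentum_step` is a hypothetical helper, and the update rule is assumed to be the one documented for mindspore.nn.Momentum (accumulation is updated first, then the variable). The sketch ignores `use_locking` and `gradient_scale`, whose exact effect is not described here.

```python
import numpy as np


def momentum_step(variable, accumulation, lr, grad, momentum, use_nesterov=False):
    """Hypothetical NumPy reference for one momentum update (not a MindSpore API).

    Assumed rule, following the formula documented for mindspore.nn.Momentum:
        accumulation <- accumulation * momentum + grad
        variable     <- variable - lr * accumulation                           (plain)
        variable     <- variable - (grad * lr + accumulation * momentum * lr)  (Nesterov)
    """
    accumulation = accumulation * momentum + grad
    if use_nesterov:
        variable = variable - (grad * lr + accumulation * momentum * lr)
    else:
        variable = variable - lr * accumulation
    return variable, accumulation


# Same inputs as the ApplyMomentum example above.
variable = np.array([[0.6, 0.4], [0.1, 0.5]], dtype=np.float32)
accumulation = np.array([[0.6, 0.5], [0.2, 0.6]], dtype=np.float32)
grad = np.array([[0.3, 0.7], [0.1, 0.8]], dtype=np.float32)

new_variable, new_accumulation = momentum_step(
    variable, accumulation, np.float32(0.1), grad, np.float32(0.9)
)
print(new_variable)
```

With the default `use_nesterov=False`, the first element works out as 0.6 - 0.1 * (0.6 * 0.9 + 0.3) = 0.516, which agrees with the operator output shown in the example up to float32 rounding.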