mindspore.ops.SGD
- class mindspore.ops.SGD(dampening=0.0, weight_decay=0.0, nesterov=False)[source]
Computes the stochastic gradient descent update. Momentum is optional.
Nesterov momentum is based on the formula from the paper *On the Importance of Initialization and Momentum in Deep Learning*.
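For reference, the conventional momentum formulation (a sketch of the update documented for nn.SGD, not necessarily the kernel's exact implementation) is:

$$
v_{t+1} = u \, v_t + (1 - \text{dampening}) \, g_t
$$

$$
p_{t+1} =
\begin{cases}
p_t - \text{lr} \cdot (g_t + u \, v_{t+1}), & \text{nesterov} = \text{True} \\
p_t - \text{lr} \cdot v_{t+1}, & \text{nesterov} = \text{False}
\end{cases}
$$

where $p_t$, $v_t$, $g_t$, $u$ and $\text{lr}$ correspond to the *parameters*, *accum*, *gradient*, *momentum* and *learning_rate* inputs; when weight_decay > 0, $g_t$ is first replaced by $g_t + \text{weight\_decay} \cdot p_t$.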
Note
For more details, please refer to nn.SGD.
- Parameters
dampening (float) - The dampening for momentum. Default: 0.0.
weight_decay (float) - Weight decay (L2 penalty). Default: 0.0.
nesterov (bool) - Enables Nesterov momentum. Default: False.
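For illustration, a minimal construction sketch with non-default arguments (the values here are arbitrary, not recommendations):

>>> from mindspore import ops
>>> # Nesterov momentum with a small L2 penalty; illustrative values only.
>>> sgd = ops.SGD(dampening=0.0, weight_decay=1e-4, nesterov=True)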
- Inputs:
parameters (Tensor) - Parameters to be updated, with float16 or float32 data type.
gradient (Tensor) - Gradient, with float16 or float32 data type.
learning_rate (Tensor) - Learning rate, a scalar tensor with float16 or float32 data type, e.g. Tensor(0.1, mindspore.float32).
accum (Tensor) - Accum (velocity) to be updated, with float16 or float32 data type.
momentum (Tensor) - Momentum, a scalar tensor with float16 or float32 data type, e.g. Tensor(0.1, mindspore.float32).
stat (Tensor) - States to be updated, with the same shape as gradient and float16 or float32 data type.
- Outputs:
Tensor, parameters to be updated.
- Raises:
TypeError – If dampening or weight_decay is not a float.
TypeError – If nesterov is not a bool.
TypeError – If parameters, gradient, learning_rate, accum, momentum or stat is not a Tensor.
TypeError – If dtype of parameters, gradient, learning_rate, accum, momentum or stat is neither float16 nor float32.
- Supported Platforms:
Ascend GPU CPU
Examples
>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor, ops
>>> sgd = ops.SGD()
>>> parameters = Tensor(np.array([2, -0.5, 1.7, 4]), mindspore.float32)
>>> gradient = Tensor(np.array([1, -1, 0.5, 2]), mindspore.float32)
>>> learning_rate = Tensor(0.01, mindspore.float32)
>>> accum = Tensor(np.array([0.1, 0.3, -0.2, -0.1]), mindspore.float32)
>>> momentum = Tensor(0.1, mindspore.float32)
>>> stat = Tensor(np.array([1.5, -0.3, 0.2, -0.7]), mindspore.float32)
>>> output = sgd(parameters, gradient, learning_rate, accum, momentum, stat)
>>> print(output)
(Tensor(shape=[4], dtype=Float32, value= [ 1.98989999e+00, -4.90300000e-01, 1.69520009e+00, 3.98009992e+00]),)
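The printed values can be sanity-checked outside MindSpore. A minimal NumPy sketch, assuming the default configuration above (dampening=0.0, weight_decay=0.0, nesterov=False) and the conventional momentum update:

>>> import numpy as np
>>> p = np.array([2, -0.5, 1.7, 4], np.float32)
>>> g = np.array([1, -1, 0.5, 2], np.float32)
>>> accum = np.array([0.1, 0.3, -0.2, -0.1], np.float32)
>>> lr, u = 0.01, 0.1
>>> v = u * accum + g   # velocity: momentum * accum + gradient (dampening = 0)
>>> print(np.round(p - lr * v, 4))  # agrees with the tensor printed above
[ 1.9899 -0.4903  1.6952  3.9801]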