mindspore.ops.SGD
- class mindspore.ops.SGD(dampening=0.0, weight_decay=0.0, nesterov=False)
Computes stochastic gradient descent, optionally with momentum.
Nesterov momentum is based on the formula from the paper On the importance of initialization and momentum in deep learning.
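The update rule, following the formulation documented for mindspore.nn.SGD (here p, v, u, and lr denote parameters, accum, momentum, and learning_rate respectively), is:

v_{t+1} = u * v_t + (1 - dampening) * gradient

If nesterov is True:

p_{t+1} = p_t - lr * (gradient + u * v_{t+1})

If nesterov is False:

p_{t+1} = p_t - lr * v_{t+1}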
Note
If parameters are not grouped, the weight_decay in the optimizer will be applied to the network parameters without 'beta' or 'gamma' in their names. Users can group parameters to change the weight-decay strategy. When parameters are grouped, each group can set its own weight_decay; if it does not, the weight_decay in the optimizer will be applied. For more details, please refer to mindspore.nn.SGD.
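As an illustration of grouping, a minimal sketch following the parameter-group convention of mindspore.nn.SGD (the network net and the 'conv' name filter are hypothetical):

>>> from mindspore import nn
>>> # net is a hypothetical, already-constructed network
>>> conv_params = list(filter(lambda x: 'conv' in x.name, net.trainable_params()))
>>> no_conv_params = list(filter(lambda x: 'conv' not in x.name, net.trainable_params()))
>>> group_params = [{'params': conv_params, 'weight_decay': 0.01},
...                 {'params': no_conv_params}]
>>> optim = nn.SGD(group_params, learning_rate=0.1, weight_decay=0.0)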
- Parameters
dampening (float) - The dampening for momentum. Default: 0.0.
weight_decay (float) - Weight decay (L2 penalty). Default: 0.0.
nesterov (bool) - Enables Nesterov momentum. Default: False.
- Inputs:
parameters (Tensor) - Parameters to be updated. With float16 or float32 data type.
gradient (Tensor) - Gradient, with float16 or float32 data type.
learning_rate (Tensor) - Learning rate, a scalar tensor with float16 or float32 data type, e.g. Tensor(0.1, mindspore.float32).
accum (Tensor) - Accum (velocity) to be updated, with float16 or float32 data type.
momentum (Tensor) - Momentum, a scalar tensor with float16 or float32 data type. e.g. Tensor(0.1, mindspore.float32).
stat (Tensor) - States to be updated with the same shape as gradient, with float16 or float32 data type.
- Outputs:
Tensor, parameters to be updated.
- Raises
TypeError – If dampening or weight_decay is not a float.
TypeError – If nesterov is not a bool.
TypeError – If parameters, gradient, learning_rate, accum, momentum or stat is not a Tensor.
TypeError – If dtype of parameters, gradient, learning_rate, accum, momentum or stat is neither float16 nor float32.
- Supported Platforms:
Ascend GPU CPU
Examples
>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor, ops
>>> sgd = ops.SGD()
>>> parameters = Tensor(np.array([2, -0.5, 1.7, 4]), mindspore.float32)
>>> gradient = Tensor(np.array([1, -1, 0.5, 2]), mindspore.float32)
>>> learning_rate = Tensor(0.01, mindspore.float32)
>>> accum = Tensor(np.array([0.1, 0.3, -0.2, -0.1]), mindspore.float32)
>>> momentum = Tensor(0.1, mindspore.float32)
>>> stat = Tensor(np.array([1.5, -0.3, 0.2, -0.7]), mindspore.float32)
>>> output = sgd(parameters, gradient, learning_rate, accum, momentum, stat)
>>> print(output.asnumpy())
[1.99 -0.4903 1.695 3.9801]
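For intuition, here is a minimal NumPy sketch that reproduces the output above. It assumes (based on the numbers in this example, not on a documented contract) that entries with stat > 0 are treated as receiving their first update, where accum is initialized directly from the gradient rather than via the momentum recurrence:

>>> import numpy as np
>>> def sgd_step(p, grad, lr, accum, u, stat, dampening=0.0, nesterov=False):
...     # assumption: stat > 0 flags an element's first update,
...     # where the velocity is initialized from the gradient
...     first = stat > 0
...     accum = np.where(first, grad, u * accum + (1 - dampening) * grad)
...     update = grad + u * accum if nesterov else accum
...     return p - lr * update
...
>>> p = np.array([2, -0.5, 1.7, 4], np.float32)
>>> grad = np.array([1, -1, 0.5, 2], np.float32)
>>> accum = np.array([0.1, 0.3, -0.2, -0.1], np.float32)
>>> stat = np.array([1.5, -0.3, 0.2, -0.7], np.float32)
>>> print(sgd_step(p, grad, 0.01, accum, 0.1, stat))
[ 1.99   -0.4903  1.695   3.9801]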