sponge.optimizer.SteepestDescent
- class sponge.optimizer.SteepestDescent(params: Union[List[Parameter], List[dict]], learning_rate: Union[float, int, Tensor, Iterable, LearningRateSchedule] = 0.001, weight_decay: Union[float, int] = 0.0, loss_scale: float = 1.0, max_shift: float = None)[source]
Implements the steepest descent (gradient descent) algorithm.
Note
If parameters are not grouped, the weight_decay in the optimizer will be applied to the network parameters that do not have 'beta' or 'gamma' in their names. Users can group parameters to change the weight decay strategy. When parameters are grouped, each group can set its own weight_decay; if it is not set, the weight_decay in the optimizer will be applied.
- Parameters
params (Union[list[mindspore.Parameter], list[dict]]) –
Must be a list of Parameter or a list of dict. When params is a list of dict, the keys "params", "lr", "grad_centralization" and "order_params" can be parsed (see the grouped-parameter sketch after the parameter descriptions).
params: Required. Parameters in current group. The value must be a list of Parameter.
lr: Optional. If "lr" is in the keys, the corresponding learning rate value will be used. If not, the learning_rate in the optimizer will be used. Fixed and dynamic learning rates are supported.
weight_decay: Using different weight_decay by grouping parameters is currently not supported.
grad_centralization: Optional. Must be Boolean. If "grad_centralization" is in the keys, the set value will be used. If not, the grad_centralization is False by default. This configuration only works on the convolution layer.
order_params: Optional. When parameters are grouped, this is usually used to maintain the order in which the parameters appear in the network, which can improve performance. The value should be the parameters whose order will be followed in the optimizer. If "order_params" is in the keys, other keys will be ignored, and the elements of "order_params" must be in one of the groups of params.
learning_rate (Union[float, int, Tensor, Iterable, LearningRateSchedule], optional) –
float: The fixed learning rate value. Must be equal to or greater than 0.
int: The fixed learning rate value. Must be equal to or greater than 0. It will be converted to float.
Tensor: Its value should be a scalar or a 1-D vector. For a scalar, a fixed learning rate will be applied. For a vector, the learning rate is dynamic and the i-th step will take the i-th value as the learning rate.
Iterable: The learning rate is dynamic. The i-th step will take the i-th value as the learning rate.
mindspore.nn.LearningRateSchedule: The learning rate is dynamic. During training, the optimizer calls the instance of LearningRateSchedule with the step as the input to get the learning rate of the current step. Both dynamic forms are illustrated in the sketch after the parameter descriptions.
Default: 0.001.
weight_decay (Union[float, int], optional) – An int or a floating point value for the weight decay. It must be equal to or greater than 0. If the type of weight_decay input is int, it will be converted to float. Default: 0.0.
loss_scale (float, optional) – A floating point value for the loss scale. It must be greater than 0. If the type of loss_scale input is int, it will be converted to float. In general, use the default value. Only when mindspore.amp.FixedLossScaleManager is used for training and the drop_overflow_update in mindspore.amp.FixedLossScaleManager is set to False does this value need to be the same as the loss_scale in mindspore.amp.FixedLossScaleManager. Refer to mindspore.amp.FixedLossScaleManager for more details. Default: 1.0.
max_shift (float, optional) – A floating point value for the maximum shift. It must be greater than 0. It bounds the shift distance applied at each iteration of the optimizer. If max_shift is None, the shift is left unbounded; if max_shift is a float, the shift is clipped to the range [-max_shift, max_shift]. Default: None.
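The grouped-parameter form can be illustrated with a minimal sketch. The split of all_params into two groups, the per-group learning rate, and the assumption that all_params holds at least two Parameter objects are arbitrary choices for illustration; only the documented keys "params", "lr" and "order_params" are used.
>>> from sponge.optimizer import SteepestDescent
>>> # all_params is assumed to be a list of mindspore.Parameter,
>>> # e.g. all_params = system.trainable_params() for a system as in the Examples below
>>> grouped_params = [{'params': all_params[:1], 'lr': 1e-6},  # this group uses its own learning rate
...                   {'params': all_params[1:]},              # this group falls back to the optimizer's learning_rate
...                   {'order_params': all_params}]            # keep the network's parameter order
>>> optim = SteepestDescent(params=grouped_params, learning_rate=1e-7)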
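Likewise, a minimal sketch of the dynamic learning-rate options and of max_shift; the per-step values, the ExponentialDecayLR schedule, and the shift bound of 0.01 are arbitrary choices for illustration.
>>> from mindspore import nn
>>> from sponge.optimizer import SteepestDescent
>>> # Iterable: step i uses the i-th value as the learning rate
>>> optim = SteepestDescent(params=all_params, learning_rate=[1e-6, 5e-7, 1e-7])
>>> # LearningRateSchedule: the optimizer queries the schedule with the current step
>>> schedule = nn.ExponentialDecayLR(learning_rate=1e-6, decay_rate=0.9, decay_steps=100)
>>> optim = SteepestDescent(params=all_params, learning_rate=schedule)
>>> # max_shift: clip each per-step shift to [-0.01, 0.01] (in the coordinate units)
>>> optim = SteepestDescent(params=all_params, learning_rate=1e-7, max_shift=0.01)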
- Inputs:
gradients (Tensor) - The gradients of the parameters.
- Outputs:
success (bool) - Whether the operation is successful.
- Raises
TypeError – If learning_rate is not one of int, float, Tensor, Iterable, LearningRateSchedule.
TypeError – If element of parameters is neither Parameter nor dict.
TypeError – If loss_scale is not a float.
TypeError – If weight_decay is neither float nor int.
ValueError – If loss_scale is less than or equal to 0.
ValueError – If weight_decay is less than 0.
ValueError – If learning_rate is a Tensor, but the dimension of the tensor is greater than 1.
- Supported Platforms:
Ascend
GPU
CPU
Examples
>>> from sponge import Sponge, Molecule, ForceField
>>> from sponge.optimizer import SteepestDescent
>>> system = Molecule(template='water.tip3p.yaml')
>>> potential = ForceField(system, parameters='SPCE')
>>> optim = SteepestDescent(params=system.trainable_params(), learning_rate=1e-7)
>>> print(system.coordinate.value())
>>> # [[[ 0.          0.          0.        ]
>>> #   [ 0.07907964  0.06120793  0.        ]
>>> #   [-0.07907964  0.06120793  0.        ]]]
>>> md = Sponge(system, potential, optim)
>>> md.run(1000)
>>> # [MindSPONGE] Started simulation at 2024-04-29 01:00:42
>>> # [MindSPONGE] Finished simulation at 2024-04-29 01:00:44
>>> # [MindSPONGE] Simulation time: 2.02 seconds.
>>> print(system.coordinate.value())
>>> # [[[ 5.3361070e-12  2.3146218e-03  0.0000000e+00]
>>> #   [ 8.1648827e-02  6.0050689e-02  0.0000000e+00]
>>> #   [-8.1648827e-02  6.0050689e-02  0.0000000e+00]]]