mindspore.nn.LARS
- class mindspore.nn.LARS(optimizer, epsilon=1e-05, coefficient=0.001, use_clip=False, lars_filter=lambda x: ...)
Implements the LARS algorithm.
LARS is an optimization algorithm that employs a large-batch optimization technique. Refer to the paper LARGE BATCH TRAINING OF CONVOLUTIONAL NETWORKS.
The updating formulas are as follows, for each layer \(l\) at step \(t\):

\[
\begin{array}{l}
\gamma^{l} = \eta \, \dfrac{\left\| w_{t}^{l} \right\|}{\left\| g_{t}^{l} \right\| + \lambda \left\| w_{t}^{l} \right\|} \\[1.5ex]
g_{t+1}^{l} = \gamma^{l} \left( g_{t}^{l} + \lambda \, w_{t}^{l} \right) \\[1ex]
w_{t+1}^{l} = w_{t}^{l} - \gamma \, g_{t+1}^{l}
\end{array}
\]

\(w\) represents the network's params, \(g\) represents gradients, \(t\) represents the current step, \(\lambda\) represents weight_decay in optimizer, \(\gamma\) represents learning_rate in optimizer, and \(\eta\) represents coefficient.
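For intuition, the core of LARS is the per-layer trust ratio \(\gamma^{l}\): the larger a layer's weight norm is relative to its gradient norm, the larger its local learning rate. Below is a minimal standalone sketch of that computation in plain NumPy (not the MindSpore implementation; the helper local_lr is illustrative only, with epsilon added to the denominator as described by the epsilon parameter below):

>>> import numpy as np
>>> def local_lr(w, g, weight_decay=1e-4, coefficient=0.001, epsilon=1e-05):
...     # trust ratio: coefficient * ||w|| / (||g|| + weight_decay * ||w|| + epsilon)
...     w_norm, g_norm = np.linalg.norm(w), np.linalg.norm(g)
...     return coefficient * w_norm / (g_norm + weight_decay * w_norm + epsilon)
>>> ratio = local_lr(np.ones(10), 0.01 * np.ones(10))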
- Parameters
  - optimizer (mindspore.nn.Optimizer) – MindSpore optimizer for which to wrap and modify gradients.
  - epsilon (float) – Term added to the denominator to improve numerical stability. Default: 1e-05.
  - coefficient (float) – Trust coefficient for calculating the local learning rate. Default: 0.001.
  - use_clip (bool) – Whether to use the clip operation when calculating the local learning rate. Default: False.
  - lars_filter (Function) – A function that determines which network parameters the LARS algorithm is applied to (see the sketch after this list). Default: lambda x: 'LayerNorm' not in x.name and 'bias' not in x.name.
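A custom lars_filter can restrict LARS to a subset of parameters by name. The sketch below is a minimal illustration; the substrings it matches ('bias', 'gamma', 'beta') and the helper name keep are assumptions for this example, not part of the API:

>>> from mindspore import nn
>>> net = LeNet5()  # any network; LeNet5 as in the Examples section below
>>> opt = nn.Momentum(net.trainable_params(), 0.1, 0.9)
>>> # Skip LARS for bias and normalization parameters ('gamma'/'beta' are common
>>> # BatchNorm parameter names; adjust the substrings for your own network).
>>> keep = lambda x: all(s not in x.name for s in ('bias', 'gamma', 'beta'))
>>> opt_lars = nn.LARS(opt, epsilon=1e-08, coefficient=0.02, lars_filter=keep)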
- Inputs:
gradients (tuple[Tensor]) - The gradients of params in the optimizer, with the same shape as the params in the optimizer.
- Supported Platforms:
Ascend
Examples
>>> import mindspore as ms
>>> from mindspore import nn
>>>
>>> # Define the network structure of LeNet5. Refer to
>>> # https://gitee.com/mindspore/docs/blob/master/docs/mindspore/code/lenet.py
>>> net = LeNet5()
>>> loss = nn.SoftmaxCrossEntropyWithLogits()
>>> opt = nn.Momentum(net.trainable_params(), 0.1, 0.9)
>>> opt_lars = nn.LARS(opt, epsilon=1e-08, coefficient=0.02)
>>> model = ms.train.Model(net, loss_fn=loss, optimizer=opt_lars, metrics=None)
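Training then proceeds through the Model API as usual: the wrapper rescales only the gradients of parameters selected by lars_filter, while the inner Momentum optimizer applies the actual update. A hedged continuation, assuming a mindspore.dataset training set named ds_train (not created above):

>>> # model.train(1, ds_train)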