Comparing the function differences between torch.optim.lr_scheduler.StepLR/MultiStepLR and mindspore.nn.piecewise_constant_lr


torch.optim.lr_scheduler.StepLR

torch.optim.lr_scheduler.StepLR(
    optimizer,
    step_size,
    gamma=0.1,
    last_epoch=-1
)

For more information, see torch.optim.lr_scheduler.StepLR.

torch.optim.lr_scheduler.MultiStepLR

torch.optim.lr_scheduler.MultiStepLR(
    optimizer,
    milestones,
    gamma=0.1,
    last_epoch=-1
)

For more information, see torch.optim.lr_scheduler.MultiStepLR.

mindspore.nn.piecewise_constant_lr

mindspore.nn.piecewise_constant_lr(
    milestone,
    learning_rates
)

For more information, see mindspore.nn.piecewise_constant_lr.

Differences

PyTorch: torch.optim.lr_scheduler.StepLR computes the learning rate from a fixed step_size: every step_size epochs, the learning rate is multiplied by gamma. torch.optim.lr_scheduler.MultiStepLR computes the learning rate from a list of milestones: whenever the epoch count reaches a value in the list, the learning rate is multiplied by gamma. During training, the optimizer is passed to the lr scheduler, and the scheduler's step method is called to update the current learning rate.

MindSpore: A list of milestone steps and a corresponding list of learning rate values are passed to the function, which returns the full list of per-step learning rates; this list is then passed to the optimizer as its learning rate.

Code Example

from mindspore import nn

# In MindSpore:
milestone = [2, 5, 10]
learning_rates = [0.1, 0.05, 0.01]
output = nn.piecewise_constant_lr(milestone, learning_rates)
print(output)
# Out:
# [0.1, 0.1, 0.05, 0.05, 0.05, 0.01, 0.01, 0.01, 0.01, 0.01]
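
The list returned above can be passed directly to a MindSpore optimizer as its learning rate. The sketch below is illustrative and not part of the original example; the nn.Dense network and nn.SGD optimizer are assumptions for demonstration. MindSpore optimizers treat a list of values as a per-step dynamic learning rate.

# Illustrative only: pass the precomputed per-step learning rates to an optimizer.
ms_net = nn.Dense(20, 1)
# The optimizer picks the value matching the current global step during training.
ms_optimizer = nn.SGD(ms_net.trainable_params(), learning_rate=output)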


# In torch:
import torch
from torch import optim

model = torch.nn.Sequential(torch.nn.Linear(20, 1))
optimizer = optim.SGD(model.parameters(), 0.1)
# step_lr: multiply the learning rate by gamma every step_size epochs
step_lr = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.9)
# multi_step_lr: multiply the learning rate by gamma at each epoch listed in milestones
multi_step_lr = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 80], gamma=0.9)
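
During training, step() is called on the scheduler (typically once per epoch) and the learning rate held by the optimizer is updated in place. The following is a minimal illustrative sketch, not part of the original example: the 10-epoch loop and the separate optimizer are assumptions, the latter so the two schedulers defined above do not compound on the same parameter group. The resulting per-epoch values play the same role as the list that mindspore.nn.piecewise_constant_lr precomputes.

# Illustrative sketch: record the learning rate seen at each of 10 epochs.
optimizer2 = optim.SGD(model.parameters(), 0.1)
scheduler = optim.lr_scheduler.StepLR(optimizer2, step_size=2, gamma=0.9)

lrs = []
for epoch in range(10):
    # ... forward pass, loss.backward() and optimizer2.step() would run here ...
    lrs.append(optimizer2.param_groups[0]["lr"])
    scheduler.step()

print(lrs)
# Roughly (up to floating-point rounding):
# [0.1, 0.1, 0.09, 0.09, 0.081, 0.081, 0.0729, 0.0729, 0.06561, 0.06561]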