mindspore.experimental
The experimental modules.
Experimental Optimizer
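| API Name | Description | Supported Platforms |
|---|---|---|
| mindspore.experimental.optim.Optimizer | Base class for all optimizers. | |
| mindspore.experimental.optim.Adadelta | Implements Adadelta algorithm. | |
| mindspore.experimental.optim.Adagrad | Implements Adagrad algorithm. | |
| mindspore.experimental.optim.Adam | Implements Adam algorithm. | |
| mindspore.experimental.optim.Adamax | Implements Adamax algorithm (a variant of Adam based on infinity norm). | |
| mindspore.experimental.optim.AdamW | Implements Adam Weight Decay algorithm. | |
| mindspore.experimental.optim.ASGD | Implements Averaged Stochastic Gradient Descent algorithm. | |
| mindspore.experimental.optim.NAdam | Implements NAdam algorithm. | |
| mindspore.experimental.optim.RAdam | Implements RAdam algorithm. | |
| mindspore.experimental.optim.RMSprop | Implements RMSprop algorithm. | |
| mindspore.experimental.optim.Rprop | Implements Rprop algorithm. | |
| mindspore.experimental.optim.SGD | Stochastic Gradient Descent optimizer. | |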
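To make the "Implements Adam algorithm" entry concrete, the following is a minimal pure-Python sketch of the textbook Adam update for a single scalar parameter. It is not MindSpore's implementation: the function name `adam_step`, its argument names, and the default hyperparameters are assumptions chosen for illustration only.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook Adam update for a scalar parameter (hypothetical helper).

    m and v are the running first and second moment estimates,
    t is the 1-based step counter used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad          # update first moment
    v = beta2 * v + (1 - beta2) * grad * grad   # update second moment
    m_hat = m / (1 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Take a few steps on f(x) = x**2, whose gradient is 2 * x, starting at x = 1.0.
x, m, v = 1.0, 0.0, 0.0
history = []
for t in range(1, 4):
    x, m, v = adam_step(x, 2.0 * x, m, v, t, lr=0.1)
    history.append(x)
print(history)
```

Because of the bias correction, the very first update has a magnitude of roughly `lr` regardless of the gradient scale, which is why `x` moves from 1.0 to about 0.9 on step one.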
LRScheduler Class
The dynamic learning rate schedulers in this module are all subclasses of LRScheduler. Use them together with the optimizers in mindspore.experimental.optim by passing the optimizer instance to the LRScheduler subclass when it is constructed. During training, the scheduler updates the learning rate each time its step method is called.
import mindspore
from mindspore import nn
from mindspore.experimental import optim

# Define the network structure of LeNet5. Refer to
# https://gitee.com/mindspore/docs/blob/master/docs/mindspore/code/lenet.py
net = LeNet5()
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
optimizer = optim.Adam(net.trainable_params(), lr=0.05)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)

def forward_fn(data, label):
    logits = net(data)
    loss = loss_fn(logits, label)
    return loss, logits

grad_fn = mindspore.value_and_grad(forward_fn, None, optimizer.parameters, has_aux=True)

def train_step(data, label):
    (loss, _), grads = grad_fn(data, label)
    optimizer(grads)
    return loss

for epoch in range(6):
    # Create the dataset taking MNIST as an example. Refer to
    # https://gitee.com/mindspore/docs/blob/master/docs/mindspore/code/mnist.py
    for data, label in create_dataset(need_download=False):
        train_step(data, label)
    # Update the learning rate once per epoch.
    scheduler.step()
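The schedule that the example above should follow can be previewed without MindSpore. The sketch below assumes the usual closed form for step decay, `base_lr * gamma ** (epoch // step_size)`, with `scheduler.step()` called once at the end of each epoch. The helper `step_lr` is hypothetical and not part of the MindSpore API.

```python
def step_lr(base_lr, step_size, gamma, epoch):
    """Learning rate in effect during a 0-based `epoch` under step decay.

    Hypothetical helper: assumes the rate is multiplied by `gamma`
    once every `step_size` epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


# Same settings as the example: lr=0.05, step_size=2, gamma=0.1, 6 epochs.
schedule = [step_lr(0.05, step_size=2, gamma=0.1, epoch=e) for e in range(6)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

Under these assumptions, epochs 0 and 1 train at 0.05, epochs 2 and 3 at about 0.005, and epochs 4 and 5 at about 0.0005.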
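| API Name | Description | Supported Platforms |
|---|---|---|
| mindspore.experimental.optim.lr_scheduler.LRScheduler | Base class of learning rate schedulers. | |
| mindspore.experimental.optim.lr_scheduler.ConstantLR | Decays the learning rate of each parameter group by a small constant factor until the number of epochs reaches a pre-defined milestone: total_iters. | |
| mindspore.experimental.optim.lr_scheduler.CosineAnnealingLR | Sets the learning rate of each parameter group using a cosine annealing lr schedule. | |
| mindspore.experimental.optim.lr_scheduler.CosineAnnealingWarmRestarts | Sets the learning rate of each parameter group using a cosine annealing warm restarts schedule. | |
| mindspore.experimental.optim.lr_scheduler.CyclicLR | Sets the learning rate of each parameter group according to the cyclical learning rate policy (CLR). | |
| mindspore.experimental.optim.lr_scheduler.ExponentialLR | For each epoch, the learning rate decays exponentially, multiplied by gamma. | |
| mindspore.experimental.optim.lr_scheduler.LambdaLR | Sets the learning rate of each parameter group to the initial lr times a given function. | |
| mindspore.experimental.optim.lr_scheduler.LinearLR | Decays the learning rate of each parameter group by linearly changing a small multiplicative factor until the number of epochs reaches a pre-defined milestone: total_iters. | |
| mindspore.experimental.optim.lr_scheduler.MultiplicativeLR | Multiplies the learning rate of each parameter group by the factor given in the specified function. | |
| mindspore.experimental.optim.lr_scheduler.MultiStepLR | Multiplies the learning rate of each parameter group by gamma once the number of epochs reaches one of the milestones. | |
| mindspore.experimental.optim.lr_scheduler.PolynomialLR | For each epoch, the learning rate is adjusted according to a polynomial decay. | |
| mindspore.experimental.optim.lr_scheduler.ReduceLROnPlateau | Reduces the learning rate when a metric has stopped improving. | |
| mindspore.experimental.optim.lr_scheduler.SequentialLR | Concatenates the learning rate adjustment strategies in schedulers in sequence, switching to the next strategy at each milestone. | |
| mindspore.experimental.optim.lr_scheduler.StepLR | Decays the learning rate of each parameter group by gamma every step_size epochs. | |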
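Several of the schedules described above have simple closed forms. The pure-Python sketch below illustrates three of them (exponential decay, milestone-based decay, and cosine annealing without restarts). The function and argument names are hypothetical, the formulas are the commonly used textbook ones rather than code taken from MindSpore, and the cosine form is only meant for `0 <= epoch <= t_max`.

```python
import math


def exponential_lr(base_lr, gamma, epoch):
    """Exponential decay: multiply by `gamma` once per epoch."""
    return base_lr * gamma ** epoch


def multi_step_lr(base_lr, milestones, gamma, epoch):
    """Milestone decay: multiply by `gamma` for every milestone already reached."""
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** reached


def cosine_annealing_lr(base_lr, t_max, epoch, eta_min=0.0):
    """Cosine annealing from `base_lr` at epoch 0 down to `eta_min` at `t_max`."""
    cosine = (1 + math.cos(math.pi * epoch / t_max)) / 2
    return eta_min + (base_lr - eta_min) * cosine


for epoch in range(0, 11, 2):
    print(
        epoch,
        round(exponential_lr(0.1, 0.5, epoch), 6),
        round(multi_step_lr(0.1, [2, 4], 0.1, epoch), 6),
        round(cosine_annealing_lr(0.1, 10, epoch), 6),
    )
```

Printing the three schedules side by side makes the differences visible: exponential decay shrinks every epoch, milestone decay stays flat between milestones, and cosine annealing falls slowly at first, fastest in the middle, and slowly again near `t_max`.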
Experimental EmbeddingService
The ES (EmbeddingService) feature supports model training and inference for PS embedding and data_parallel embedding, and provides unified embedding management, storage, and computing capabilities for training and inference. |
|
Look up a PS embedding. |
|
Look up a data_parallel embedding. |