mindspore.amp.DynamicLossScaler

class mindspore.amp.DynamicLossScaler(scale_value, scale_factor, scale_window)

Manager for dynamically adjusting the loss scaling factor.

Dynamic loss scaling tries to determine the largest loss scale value that keeps gradients finite. It increases the loss scale by scale_factor after every scale_window consecutive steps with finite gradients; whenever an overflow occurs, it multiplies the loss scale by 1 / scale_factor and resets the counter.
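
In rough pseudocode, the adjustment rule amounts to the following (a plain-Python sketch of the behavior just described, not the MindSpore implementation; all names are illustrative):

    # Illustrative sketch only -- not MindSpore internals.
    def adjust_sketch(scale_value, counter, grads_finite, scale_factor, scale_window):
        if grads_finite:
            counter += 1
            if counter == scale_window:
                # Gradients stayed finite for scale_window steps: grow the scale.
                scale_value *= scale_factor
                counter = 0
        else:
            # Overflow detected: shrink the scale and reset the counter.
            scale_value /= scale_factor
            counter = 0
        return scale_value, counter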

Warning

This is an experimental API that is subject to change or deletion.

Parameters
  • scale_value (Union(float, int)) – The initial loss scale value.

  • scale_factor (int) – The factor by which the loss scale is increased or decreased.

  • scale_window (int) – The number of consecutive overflow-free training steps after which the loss scale is increased.

Supported Platforms:

Ascend GPU CPU

Examples

>>> import mindspore
>>> from mindspore import amp, Tensor
>>> import numpy as np
>>> loss_scaler = amp.DynamicLossScaler(scale_value=2**10, scale_factor=2, scale_window=1)
>>> grads = (Tensor(np.array([np.log(-1), 1.0]), mindspore.float16),
...             Tensor(np.array([0.2]), mindspore.float16))
>>> unscaled_grads = loss_scaler.unscale(grads)
>>> grads_finite = amp.all_finite(unscaled_grads)
>>> loss_scaler.adjust(grads_finite)
True
>>> print(loss_scaler.scale_value.asnumpy())
512.0
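
One common way to wire the scaler into a training step is together with mindspore.value_and_grad and mindspore.amp.all_finite. The following is a minimal PyNative-mode sketch; the network, loss, optimizer, and data are placeholder choices for illustration:

    import mindspore
    import numpy as np
    from mindspore import amp, nn, Tensor

    net = nn.Dense(4, 1)
    loss_fn = nn.MSELoss()
    optimizer = nn.SGD(net.trainable_params(), learning_rate=0.01)
    loss_scaler = amp.DynamicLossScaler(scale_value=2**10, scale_factor=2, scale_window=50)

    def forward_fn(x, y):
        loss = loss_fn(net(x), y)
        # Scale the loss so that small gradients survive the low-precision backward pass.
        return loss_scaler.scale(loss)

    grad_fn = mindspore.value_and_grad(forward_fn, None, optimizer.parameters)

    def train_step(x, y):
        loss, grads = grad_fn(x, y)
        grads = loss_scaler.unscale(grads)
        is_finite = amp.all_finite(grads)
        loss_scaler.adjust(is_finite)
        if is_finite:
            # Apply the update only when no overflow occurred.
            optimizer(grads)
        return loss_scaler.unscale(loss)

    x = Tensor(np.random.randn(8, 4).astype(np.float32))
    y = Tensor(np.random.randn(8, 1).astype(np.float32))
    print(train_step(x, y))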
adjust(grads_finite)

Adjust the scale_value depending on whether the grads are finite.

Parameters

grads_finite (Tensor) – a scalar bool Tensor indicating whether the grads are finite.

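Assuming the adjustment rule described above, a single finite step with scale_window=1 should double the scale (a small illustrative check; the expected outputs follow from that rule):

>>> import mindspore
>>> from mindspore import amp, Tensor
>>> scaler = amp.DynamicLossScaler(scale_value=2**10, scale_factor=2, scale_window=1)
>>> scaler.adjust(Tensor(True, mindspore.bool_))
True
>>> print(scaler.scale_value.asnumpy())
2048.0
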
scale(inputs)

Scale the inputs by scale_value.

Parameters

inputs (Union(Tensor, tuple(Tensor))) – the input loss value or gradients.

Returns

Union(Tensor, tuple(Tensor)), the scaled value.

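A quick illustrative sketch (the expected output follows from the constructor arguments shown):

>>> import mindspore
>>> from mindspore import amp, Tensor
>>> scaler = amp.DynamicLossScaler(scale_value=2**10, scale_factor=2, scale_window=50)
>>> print(scaler.scale(Tensor(0.5, mindspore.float32)).asnumpy())
512.0
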
unscale(inputs)

Unscale the inputs by dividing them by scale_value.

Parameters

inputs (Union(Tensor, tuple(Tensor))) – the input loss value or gradients.

Returns

Union(Tensor, tuple(Tensor)), the unscaled value.

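unscale is the inverse of scale; a quick round-trip sketch (same illustrative scaler settings as above):

>>> import mindspore
>>> import numpy as np
>>> from mindspore import amp, Tensor
>>> scaler = amp.DynamicLossScaler(scale_value=2**10, scale_factor=2, scale_window=50)
>>> grads = (Tensor(np.array([512.0]), mindspore.float32),
...          Tensor(np.array([2048.0]), mindspore.float32))
>>> unscaled = scaler.unscale(grads)
>>> print(unscaled[0].asnumpy(), unscaled[1].asnumpy())
[0.5] [2.]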