mindspore.ops.ApplyAdagradDA

class mindspore.ops.ApplyAdagradDA(use_locking=False)

Updates var according to the proximal adagrad scheme. The Adagrad algorithm was proposed in "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization".

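$$
\begin{array}{l}
grad\_accum \mathrel{+}= grad \\
grad\_squared\_accum \mathrel{+}= grad * grad \\
tmp\_val =
    \begin{cases}
        sign(grad\_accum) * \max\left\{|grad\_accum| - l1 * global\_step,\ 0\right\} & \text{if } l1 > 0 \\
        grad\_accum & \text{otherwise}
    \end{cases} \\
x\_value = -1 * lr * tmp\_val \\
y\_value = l2 * global\_step * lr + \sqrt{grad\_squared\_accum} \\
var = \dfrac{x\_value}{y\_value}
\end{array}
$$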
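To make the update rule above concrete, here is a minimal single-element sketch in plain Python. The function name `adagrad_da_step` is hypothetical and is not part of MindSpore. The code only mirrors the formula and is not the operator's actual implementation. The negative sign in x_value and the square root in y_value are taken from the rule as documented for this operator, and they reproduce the values printed in the Examples section.

```python
import math


def adagrad_da_step(grad_accum, grad_squared_accum, grad, lr, l1, l2, global_step):
    """Scalar sketch of one ApplyAdagradDA update (hypothetical helper).

    Returns (var, grad_accum, grad_squared_accum). The new var depends only
    on the accumulators, not on the previous value of var.
    """
    grad_accum += grad
    grad_squared_accum += grad * grad
    if l1 > 0:
        # Shrink the accumulated gradient by l1 * global_step, clipping at zero,
        # then restore its sign. For grad_accum == 0 and global_step >= 0 the
        # shrunk value is already 0, so the sign does not matter.
        shrunk = max(abs(grad_accum) - l1 * global_step, 0.0)
        tmp_val = math.copysign(shrunk, grad_accum)
    else:
        tmp_val = grad_accum
    x_value = -1 * lr * tmp_val
    y_value = l2 * global_step * lr + math.sqrt(grad_squared_accum)
    return x_value / y_value, grad_accum, grad_squared_accum


# Small accumulated gradient: |0.001 + 0.001| <= 0.01 * 2, so var is clipped to zero.
small = adagrad_da_step(0.001, 0.0, 0.001, lr=0.1, l1=0.01, l2=0.0, global_step=2)
assert small[0] == 0.0

# Larger accumulated gradient: survives the threshold, so var moves against it.
large = adagrad_da_step(0.5, 0.25, 0.5, lr=0.1, l1=0.01, l2=0.0, global_step=2)
assert large[0] < 0
```

The two calls illustrate the effect of the l1 term: an accumulated gradient smaller in magnitude than l1 * global_step yields a var of exactly zero, while a larger one yields a nonzero var of the opposite sign.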

The inputs var, gradient_accumulator, gradient_squared_accumulator and grad follow the implicit type conversion rules so that their data types are consistent. If they have different data types, the lower-priority data type is converted to the highest-priority data type among them.

Parameters

use_locking (bool) – If True, updates to var and the accumulator tensors are protected by a lock. Otherwise the behavior is undefined, but may exhibit less contention. Default: False.

Inputs:
  • var (Parameter) - Variable to be updated. The data type must be float16 or float32. The shape is (N, *), where * means any number of additional dimensions.

  • gradient_accumulator (Parameter) - The mutable accumulator tensor, grad_accum in the formula. Must have the same shape and dtype as var.

  • gradient_squared_accumulator (Parameter) - The mutable squared-gradient accumulator tensor, grad_squared_accum in the formula. Must have the same shape and dtype as var.

  • grad (Tensor) - A tensor for gradient. Must have the same shape and dtype as var.

  • lr ([Number, Tensor]) - Scaling factor. Must be a scalar with float32 or float16 data type.

  • l1 ([Number, Tensor]) - L1 regularization. Must be a scalar with float32 or float16 data type.

  • l2 ([Number, Tensor]) - L2 regularization. Must be a scalar with float32 or float16 data type.

  • global_step ([Number, Tensor]) - Training step number. Must be a scalar with int32 or int64 data type.

Outputs:

Tuple of 3 Tensors, the updated parameters.

  • var (Tensor) - The same shape and data type as var.

  • gradient_accumulator (Tensor) - The same shape and data type as gradient_accumulator.

  • gradient_squared_accumulator (Tensor) - The same shape and data type as gradient_squared_accumulator.

Raises
  • TypeError – If var, gradient_accumulator or gradient_squared_accumulator is not a Parameter.

  • TypeError – If grad is not a Tensor.

  • TypeError – If lr, l1, l2 or global_step is neither a Number nor a Tensor.

  • TypeError – If use_locking is not a bool.

  • TypeError – If dtype of var, gradient_accumulator, gradient_squared_accumulator, grad, lr, l1 or l2 is neither float16 nor float32.

  • TypeError – If dtype of gradient_accumulator, gradient_squared_accumulator or grad is not the same as that of var.

  • TypeError – If dtype of global_step is neither int32 nor int64.

  • ValueError – If the shape size of lr, l1, l2 or global_step is not 0.

  • RuntimeError – If the data type conversion of var, gradient_accumulator, gradient_squared_accumulator and grad is not supported.

Supported Platforms:

Ascend GPU CPU

Examples

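>>> import numpy as np
>>> import mindspore.nn as nn
>>> from mindspore import Tensor, Parameter, ops
>>> from mindspore import dtype as mstype
>>> class ApplyAdagradDANet(nn.Cell):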
...     def __init__(self, use_locking=False):
...         super(ApplyAdagradDANet, self).__init__()
...         self.apply_adagrad_d_a = ops.ApplyAdagradDA(use_locking)
...         self.var = Parameter(Tensor(np.array([[0.6, 0.4], [0.1, 0.5]]).astype(np.float32)), name="var")
...         self.gradient_accumulator = Parameter(Tensor(np.array([[0.1, 0.3],
...                                                                [0.1, 0.5]]).astype(np.float32)),
...                                               name="gradient_accumulator")
...         self.gradient_squared_accumulator = Parameter(Tensor(np.array([[0.2, 0.1],
...                                                                        [0.1, 0.2]]).astype(np.float32)),
...                                                       name="gradient_squared_accumulator")
...     def construct(self, grad, lr, l1, l2, global_step):
...         out = self.apply_adagrad_d_a(self.var, self.gradient_accumulator,
...                                      self.gradient_squared_accumulator, grad, lr, l1, l2, global_step)
...         return out
...
>>> net = ApplyAdagradDANet()
>>> grad = Tensor(np.array([[0.3, 0.4], [0.1, 0.2]]).astype(np.float32))
>>> lr = Tensor(0.001, mstype.float32)
>>> l1 = Tensor(0.001, mstype.float32)
>>> l2 = Tensor(0.001, mstype.float32)
>>> global_step = Tensor(2, mstype.int32)
>>> output = net(grad, lr, l1, l2, global_step)
>>> print(output)
(Tensor(shape=[2, 2], dtype=Float32, value=
[[-7.39064650e-04, -1.36888528e-03],
 [-5.96988888e-04, -1.42478070e-03]]), Tensor(shape=[2, 2], dtype=Float32, value=
[[ 4.00000006e-01,  7.00000048e-01],
 [ 2.00000003e-01,  6.99999988e-01]]), Tensor(shape=[2, 2], dtype=Float32, value=
[[ 2.90000021e-01,  2.60000020e-01],
 [ 1.09999999e-01,  2.40000010e-01]]))
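
As a sanity check on the printed result, the first element of each output can be worked by hand from the update rule, using the example's values for that position (gradient_accumulator 0.1, gradient_squared_accumulator 0.2, grad 0.3, lr = l1 = l2 = 0.001, global_step = 2). The figures below are rounded and computed in ordinary decimal arithmetic rather than float32, so the trailing digits may differ slightly from the printed tensors.

```latex
\begin{aligned}
grad\_accum &= 0.1 + 0.3 = 0.4 \\
grad\_squared\_accum &= 0.2 + 0.3^2 = 0.29 \\
tmp\_val &= sign(0.4) \cdot \max\{|0.4| - 0.001 \cdot 2,\ 0\} = 0.398 \\
x\_value &= -1 \cdot 0.001 \cdot 0.398 = -3.98 \times 10^{-4} \\
y\_value &= 0.001 \cdot 2 \cdot 0.001 + \sqrt{0.29} \approx 0.5385185 \\
var &= \frac{x\_value}{y\_value} \approx -7.3906 \times 10^{-4}
\end{aligned}
```

These agree with the first entries of the three printed tensors: about -7.3906e-04 for var, 0.4 for gradient_accumulator and 0.29 for gradient_squared_accumulator.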