mindspore.Parameter

class mindspore.Parameter(default_input, *args, **kwargs)[source]

An object holding the weights of cells. After being initialized, a Parameter is a subtype of Tensor.

Note

In auto_parallel mode of “semi_auto_parallel” and “auto_parallel”, if a Parameter is initialized with a Tensor, the type of the Parameter will be Tensor. A Tensor saves only the shape and type information of a tensor with no memory usage, and its shape can be changed while compiling for auto-parallel. Calling init_data will return a Parameter with initialized Tensor data. If there is an operator in the network that requires part of its inputs to be Parameters, then the Parameters used as that part of the inputs are not allowed to be cast. It is recommended to use the default value of name when initializing a parameter as an attribute of a cell; otherwise, the parameter name may be different from expected.

Parameters
  • default_input (Union[Tensor, int, float, numpy.ndarray, list]) – Parameter data, used to initialize the parameter.

  • name (str) –

    Name of the parameter. Default: None.

    1) If the parameter is not given a name, the default name is its variable name. For example, in the naming sketch after this parameter list, the name of param_a is the explicitly given name_a, and the name of param_b is the variable name param_b.

    2) If a parameter in a list or tuple is not given a name, a unique name such as Parameter$1 or Parameter$2 will be generated for it.

    3) If the parameter is given a name and another parameter used in the same network already has that name, an exception such as “its name ‘name_a’ already exists.” will be thrown; parameters that share a name but are not used in the same network do not trigger the exception.

  • requires_grad (bool) – True if the parameter requires gradient. Default: True.

  • layerwise_parallel (bool) – When layerwise_parallel is True in data/hybrid parallel mode, broadcast and gradient communication are not applied to this parameter. Default: False.

  • parallel_optimizer (bool) – It is used to filter the weight shard operation in semi-auto or auto parallel mode. It works only when the parallel optimizer is enabled in mindspore.context.set_auto_parallel_context(). Default: True.
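
The naming rules above can be illustrated with a minimal sketch; the class and variable names are hypothetical, and rules 2) and 3) are only noted in comments because they depend on how the parameters are grouped and used.

>>> import numpy as np
>>> import mindspore
>>> import mindspore.nn as nn
>>> from mindspore import Parameter, Tensor
>>>
>>> class NameNet(nn.Cell):
...     def __init__(self):
...         super(NameNet, self).__init__()
...         # Rule 1): an explicit name is kept; an unnamed parameter takes its attribute name.
...         self.param_a = Parameter(Tensor(np.ones((1, 2)), mindspore.float32), name="name_a")
...         self.param_b = Parameter(Tensor(np.ones((1, 2)), mindspore.float32))
...         # Rule 2): unnamed parameters placed in a list or tuple receive generated
...         # names such as Parameter$1 and Parameter$2.
...         # Rule 3): giving two parameters used in the same network the same explicit
...         # name raises an exception.
...
...     def construct(self, x):
...         return x
>>> net = NameNet()
>>> print(net.param_a.name, net.param_b.name)
name_a param_b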

Examples

>>> import numpy as np
>>> from mindspore import Parameter, Tensor
>>> import mindspore.ops as ops
>>> import mindspore.nn as nn
>>> import mindspore
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.matmul = ops.MatMul()
...         self.weight = Parameter(Tensor(np.ones((1, 2)), mindspore.float32), name="w", requires_grad=True)
...
...     def construct(self, x):
...         out = self.matmul(self.weight, x)
...         return out
>>> net = Net()
>>> x = Tensor(np.ones((2, 1)), mindspore.float32)
>>> print(net(x))
[[2.]]
>>> net.weight.set_data(Tensor(np.zeros((1, 2)), mindspore.float32))
>>> print(net(x))
[[0.]]

property cache_enable

Return whether caching is enabled for the parameter.

property cache_shape

Return the cache shape corresponding to the parameter if caching is used.

clone(init='same')[source]

Clone the parameter.

Parameters

init (Union[Tensor, str, numbers.Number]) – Initialize the shape and dtype of the parameter. If init is a Tensor or numbers.Number, clone a new parameter with the same shape and dtype, and set the data of the new parameter according to init. If init is a str, it should be the alias of a class inheriting from Initializer. For example, if init is ‘same’, clone a new parameter with the same data, shape, and dtype. Default: ‘same’.

Returns

Parameter, a new parameter.
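
A minimal sketch of cloning (the variable names are hypothetical, and ‘zeros’ is assumed to be a valid Initializer alias):

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Parameter, Tensor
>>> weight = Parameter(Tensor(np.ones((1, 2)), mindspore.float32), name="w")
>>> w_same = weight.clone()              # same data, shape and dtype as weight
>>> w_zero = weight.clone(init='zeros')  # same shape and dtype, data re-initialized to zeros
>>> print(w_same.asnumpy())
[[1. 1.]]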

property comm_fusion

Get and set the fusion type (int) for communication operators corresponding to this parameter.

In AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode, some communication operators used for parameters or gradients aggregation are inserted automatically. Set the fusion type for communication operators generated for this parameter. The value of fusion must be greater than or equal to 0. When the value of fusion is 0, operators will not be fused together.

Only supported in the Ascend environment with Graph mode.
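
A minimal sketch of setting the fusion type (hypothetical names; the value only takes effect when communication operators are actually inserted for this parameter in an Ascend Graph-mode parallel run):

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Parameter, Tensor
>>> weight = Parameter(Tensor(np.ones((1, 2)), mindspore.float32), name="w")
>>> weight.comm_fusion = 2   # put this parameter's communication operators into fusion group 2
>>> print(weight.comm_fusion)
2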

property data

Return the parameter object.

init_data(layout=None, set_sliced=False)[source]

Initialize the parameter’s data.

Parameters
  • layout (Union[None, tuple(list(int))]) –

    Parameter slice layout [dev_mat, tensor_map, slice_shape]. Default: None.

    • dev_mat (list(int)): Device matrix.

    • tensor_map (list(int)): Tensor map.

    • slice_shape (list(int)): Shape of slice.

  • set_sliced (bool) – True if the parameter should be marked as sliced after initializing the data. Default: False.

Raises
  • RuntimeError – If the data is from an Initializer and the parallel mode has changed after the Initializer was created.

  • ValueError – If the length of the layout is less than 3.

  • TypeError – If layout is not a tuple.

Returns

Parameter, the Parameter after its data is initialized. If the current Parameter was already initialized, the same initialized Parameter is returned.
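
A minimal sketch (hypothetical names), assuming the Parameter is created from an Initializer via mindspore.common.initializer.initializer so that its data is not materialized until init_data() is called:

>>> import mindspore
>>> from mindspore import Parameter
>>> from mindspore.common.initializer import initializer
>>> weight = Parameter(initializer('ones', [1, 2], mindspore.float32), name="w")
>>> weight = weight.init_data()
>>> print(weight.asnumpy())
[[1. 1.]]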

property inited_param

Get the new parameter created by calling init_data.

Default is None. If self is a Parameter without data, the initialized Parameter with data will be recorded here after init_data is called.

property is_init

Get the initialization status of the parameter.

This flag only works in GE, and it is set to False on other backends.

property layerwise_parallel

When layerwise_parallel is True in data/hybrid parallel mode, broadcast and gradient communication are not applied to this parameter.

property name

Get the name of the parameter.

property parallel_optimizer

It is used to filter the weight shard operation in semi-auto or auto parallel mode. It works only when the parallel optimizer is enabled in mindspore.context.set_auto_parallel_context().

property parallel_optimizer_comm_recompute

Get and set whether to recompute the communication operators corresponding to this parameter when applying the parallel optimizer.

In AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode, when applying the parallel optimizer, some all_gather operators used for parameter gathering are inserted automatically. This interface is used to control the recompute attribute of those all_gather operators.

Note

  • Only the Ascend backend with Graph mode is supported.

  • It is recommended to use cell.recompute(parallel_optimizer_comm_recompute=True/False) to configure the all_gather operators introduced by the parallel optimizer, rather than using this interface directly.
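
A minimal sketch of the recommended usage (nn.Dense is only an illustrative sub-cell; the setting takes effect only on Ascend in Graph mode with the parallel optimizer enabled):

>>> import mindspore.nn as nn
>>> from mindspore import context
>>> context.set_context(mode=context.GRAPH_MODE)
>>> block = nn.Dense(16, 16)
>>> block.recompute(parallel_optimizer_comm_recompute=True)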

property requires_grad

Return whether the parameter requires gradient.

set_data(data, slice_shape=False)[source]

Set Parameter’s data.

Parameters
  • data (Union[Tensor, int, float]) – New data.

  • slice_shape (bool) – If slice_shape is set to True, the shape of the new data is not checked for consistency. Default: False.

Returns

Parameter, the parameter after its data is set.
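
A minimal sketch (hypothetical names); the returned Parameter allows the call to be reassigned or chained:

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Parameter, Tensor
>>> weight = Parameter(Tensor(np.ones((1, 2)), mindspore.float32), name="w")
>>> weight = weight.set_data(Tensor(np.zeros((1, 2)), mindspore.float32))
>>> print(weight.asnumpy())
[[0. 0.]]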

set_param_fl(push_to_server=False, pull_from_server=False, requires_aggr=True)[source]

Set the way the parameter interacts with the server.

Parameters
  • push_to_server (bool) – Whether the parameter should be pushed to server. Default: False.

  • pull_from_server (bool) – Whether the parameter should be pulled from server. Default: False.

  • requires_aggr (bool) – Whether the parameter should be aggregated in the server. Default: True.

set_param_ps(init_in_server=False)[source]

Set whether the trainable parameter is updated by the parameter server and whether it is initialized on the server.

Note

It only works when the running task is in parameter server mode.

Parameters

init_in_server (bool) – Whether the trainable parameter updated by the parameter server is initialized on the server. Default: False.

property sliced

Get slice status of the parameter.

property unique

Return whether the parameter is already unique or not.