mindspore.Parameter
- class mindspore.Parameter(default_input, name=None, requires_grad=True, layerwise_parallel=False, parallel_optimizer=True)[source]
Parameter is a Tensor subclass. When a Parameter is assigned as a Cell attribute, it is automatically added to the list of the Cell's parameters and appears, for example, in the cell.get_parameters() iterator.
Note
In the “semi_auto_parallel” and “auto_parallel” parallel modes, if a Parameter is initialized from a Tensor, the type of the Parameter will be Tensor. The Tensor stores only the shape and type information of a tensor, without using memory for data. The shape can change during compilation for auto-parallel. Calling init_data returns a Tensor Parameter with initialized data.
If an operator in the network requires some of its inputs to be Parameters, those Parameters must not be cast.
Give each Parameter a unique name to facilitate subsequent operations and updates. If two or more Parameter objects in a network share the same name, you will be prompted to set a unique name when defining them.
- Parameters
default_input (Union[Tensor, int, float, numpy.ndarray, list]) – Data used to initialize the parameter.
name (str) –
Name of the parameter. Default: None.
1) If the parameter is not given a name, its name defaults to the name of the variable it is assigned to. For example, the name of param_a below is name_a, and the name of param_b is the variable name param_b.
self.param_a = Parameter(Tensor([1], ms.float32), name="name_a")
self.param_b = Parameter(Tensor([2], ms.float32))
2) If a parameter in a list or tuple is not given a name, it is assigned a unique name. For example, the names of the parameters below are Parameter$1 and Parameter$2.
self.param_list = [Parameter(Tensor([3], ms.float32)), Parameter(Tensor([4], ms.float32))]
3) If the parameter is given a name and another parameter already uses the same name, an exception is thrown. For example, the code below throws “its name ‘name_a’ already exists.”
self.param_a = Parameter(Tensor([1], ms.float32), name="name_a")
self.param_tuple = (Parameter(Tensor([5], ms.float32), name="name_a"), Parameter(Tensor([6], ms.float32)))
4) If a parameter appears multiple times in a list or tuple, the name of the object is checked only once. For example, the following code does not throw an exception.
self.param_a = Parameter(Tensor([1], ms.float32), name="name_a")
self.param_tuple = (self.param_a, self.param_a)
requires_grad (bool) – True if the parameter requires gradient. Default: True.
layerwise_parallel (bool) – When layerwise_parallel is True in data parallel or hybrid parallel mode, broadcast and gradient communication are not applied to the parameter. Default: False.
parallel_optimizer (bool) – Used to filter the weight shard operation in semi auto or auto parallel mode. It takes effect only when the parallel optimizer is enabled in mindspore.context.set_auto_parallel_context(). Default: True.
Examples
>>> import numpy as np
>>> from mindspore import Parameter, Tensor
>>> import mindspore.ops as ops
>>> import mindspore.nn as nn
>>> import mindspore
>>>
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.matmul = ops.MatMul()
...         self.weight = Parameter(Tensor(np.ones((1, 2)), mindspore.float32), name="w", requires_grad=True)
...
...     def construct(self, x):
...         out = self.matmul(self.weight, x)
...         return out
>>> net = Net()
>>> x = Tensor(np.ones((2, 1)), mindspore.float32)
>>> print(net(x))
[[2.]]
>>> net.weight.set_data(Tensor(np.zeros((1, 2)), mindspore.float32))
>>> print(net(x))
[[0.]]
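The four naming rules listed for the name argument can be easier to follow as a small model. The sketch below is plain Python and does not import MindSpore. ToyParam and assign_names are hypothetical names invented for illustration, and the use of ValueError is an assumption, since the text above only says that an exception is thrown. It is a toy model of the documented behavior, not the library's implementation.

```python
# Toy model of the documented naming rules. NOT MindSpore's implementation.
class ToyParam:
    """Hypothetical stand-in for Parameter that only carries a name."""

    def __init__(self, name=None):
        self.name = name


def assign_names(attrs):
    """Resolve names for a dict of attribute name -> ToyParam or list/tuple of ToyParam."""
    seen_names = set()     # names already taken in this toy "cell"
    seen_objects = set()   # ids of objects that were already checked
    counter = 0            # numbering for unnamed parameters inside sequences
    for attr, value in attrs.items():
        in_sequence = isinstance(value, (list, tuple))
        params = value if in_sequence else [value]
        for param in params:
            if id(param) in seen_objects:
                continue  # rule 4: the same object is checked only once
            seen_objects.add(id(param))
            if param.name is None:
                if in_sequence:
                    counter += 1
                    param.name = f"Parameter${counter}"  # rule 2
                else:
                    param.name = attr                    # rule 1
            elif param.name in seen_names:
                # rule 3: the exception type is an assumption
                raise ValueError(f"its name '{param.name}' already exists.")
            seen_names.add(param.name)
    return attrs


# Rule 1: an unnamed attribute takes the variable name.
b = ToyParam()
assign_names({"param_a": ToyParam("name_a"), "param_b": b})
print(b.name)

# Rule 2: unnamed parameters in a list receive generated names.
items = [ToyParam(), ToyParam()]
assign_names({"param_list": items})
print([p.name for p in items])
```

Passing the same ToyParam object twice in a tuple does not raise in this model, whereas two distinct objects that share a name do, which mirrors rules 3 and 4.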
- property cache_enable
Return whether cache is enabled for the parameter.
- property cache_shape
Return the cache shape corresponding to the parameter if cache is used.
- clone(init='same')[source]
Clone the parameter.
- Parameters
init (Union[Tensor, str, numbers.Number]) – Initialize the shape and dtype of the parameter. If init is a Tensor or numbers.Number, clone a new parameter with the same shape and dtype, and set the data of the new parameter according to init. If init is a str, it should be the alias of a class inheriting from Initializer. For example, if init is ‘same’, clone a new parameter with the same data, shape, and dtype. Default: ‘same’.
- Returns
Parameter, a new parameter.
- property comm_fusion
Get the fusion type (int) for communication operators corresponding to this parameter.
In AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode, some communication operators used for parameter or gradient aggregation are inserted automatically. The value of fusion must be greater than or equal to 0. When the value of fusion is 0, the operators are not fused together.
- property data
Return the parameter object.
- init_data(layout=None, set_sliced=False)[source]
Initialize the parameter’s data.
- Parameters
layout (Union[None, tuple]) –
The parameter’s layout info, given in the form [dev_mat, tensor_map, slice_shape, filed_size, uniform_split, opt_shard_group]. Default: None. It is not None only in ‘SEMI_AUTO_PARALLEL’ or ‘AUTO_PARALLEL’ mode.
dev_mat (list(int)): The parameter’s device matrix.
tensor_map (list(int)): The parameter’s tensor map.
slice_shape (list(int)): The parameter’s slice shape.
filed_size (int): The parameter’s filed size.
uniform_split (bool): Whether the parameter is split evenly.
opt_shard_group (str): The group of the parameter while running optimizer parallel.
set_sliced (bool) – True if the parameter is set sliced after initializing the data. Default: False.
- Raises
RuntimeError – If the parameter comes from an Initializer and the parallel mode has changed after the Initializer was created.
ValueError – If the length of the layout is less than 6.
TypeError – If layout is not a tuple.
- Returns
Parameter, the Parameter after its data has been initialized. If the current Parameter was already initialized, the same initialized Parameter is returned.
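The layout argument packs six fields into one tuple, so it can help to see the documented checks spelled out. The sketch below is plain Python and does not call MindSpore. Layout and parse_layout are hypothetical helpers, and the sample values are made up for illustration. It only mirrors the field order and the TypeError and ValueError conditions listed above, under the assumption that the type check comes before the length check.

```python
from typing import List, NamedTuple, Optional


class Layout(NamedTuple):
    """Hypothetical named view of the six documented layout fields."""
    dev_mat: List[int]       # device matrix
    tensor_map: List[int]    # tensor map
    slice_shape: List[int]   # shape of the slice held on one device
    filed_size: int          # spelled as in the documented field name
    uniform_split: bool      # whether the parameter is split evenly
    opt_shard_group: str     # group used when running optimizer parallel


def parse_layout(layout) -> Optional[Layout]:
    """Mirror the documented checks; this is not the library's own code."""
    if layout is None:
        return None  # documented default outside the auto-parallel modes
    if not isinstance(layout, tuple):
        raise TypeError("layout must be a tuple")
    if len(layout) < 6:
        raise ValueError("layout must contain at least 6 items")
    return Layout(*layout[:6])


# Illustrative values only; real layouts are produced by the framework.
example = parse_layout(([2, 4], [1, -1], [16, 32], 0, True, ""))
print(example.slice_shape)
```

Naming the fields keeps the positional order visible to the reader, which is the main point of the sketch.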
- property inited_param
Get the new parameter created by calling init_data.
Default: None. If self is a Parameter without data, the initialized Parameter with data is recorded here after init_data is called.
- property layerwise_parallel
Get the layerwise parallel status (bool) of the parameter.
When layerwise_parallel is True in DATA_PARALLEL or HYBRID_PARALLEL parallel mode, broadcast and gradient communication are not applied to the parameter.
- property name
Get the name of the parameter.
- property parallel_optimizer
Get the optimizer parallel status (bool) of the parameter.
It is used to filter the weight shard operation in AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode. It takes effect only when the parallel optimizer is enabled in mindspore.context.set_auto_parallel_context().
- property parallel_optimizer_comm_recompute
Get the communication recompute status (bool) of optimizer parallel for the parameter.
In AUTO_PARALLEL and SEMI_AUTO_PARALLEL mode, when the parallel optimizer is applied, some AllGather operators used for gathering parameters are inserted automatically. This property controls the recompute attribute of those AllGather operators.
Note
Only Graph mode is supported.
It is recommended to use cell.recompute(parallel_optimizer_comm_recompute=True/False) to configure the AllGather operators introduced by the parallel optimizer rather than using this interface directly.
- property requires_grad
Return whether the parameter requires gradient.
The main function of requires_grad is to tell auto grad to start recording operations on a Tensor. If a Tensor has requires_grad=False, setting requires_grad to True makes auto grad start recording operations on the tensor.
- set_param_fl(push_to_server=False, pull_from_server=False, requires_aggr=True)[source]
Set how the parameter interacts with the server.
- set_param_ps(init_in_server=False)[source]
Set whether the trainable parameter is updated by the parameter server and whether it is initialized on the server.
Note
It only works when a running task is in the parameter server mode.
- Parameters
init_in_server (bool) – Whether a trainable parameter updated by the parameter server is initialized on the server. Default: False.
- property sliced
Get slice status of the parameter.
- property unique
Whether the parameter is already unique or not.