# Differences with torch.nn.parameter.Parameter

## torch.nn.parameter.Parameter

```python
torch.nn.parameter.Parameter(data=None, requires_grad=True)
```

For more details, see torch.nn.parameter.Parameter.

## mindspore.Parameter

```python
mindspore.Parameter(default_input, name=None, requires_grad=True, layerwise_parallel=False, parallel_optimizer=True)
```

For more details, see mindspore.Parameter.

## Differences

PyTorch: In PyTorch, a Parameter is a special type of tensor that is a subclass of the standard Tensor. Unlike regular tensors, a Parameter assigned as an attribute of a Module is automatically registered as a model parameter and is therefore updated by optimizers.
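For instance, here is a minimal sketch (the module and attribute names are made up for illustration) showing that only the Parameter attribute is registered:

```python
import torch
import torch.nn as nn

class MyModule(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # A Parameter assigned as a module attribute is registered automatically.
        self.weight = nn.Parameter(torch.ones(1, 2))
        # A plain tensor attribute is not registered as a parameter.
        self.scale = torch.ones(1, 2)

m = MyModule()
print([name for name, _ in m.named_parameters()])
# ['weight']  -> only the Parameter is visible to optimizers
```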

MindSpore: In MindSpore, a Parameter is also a special type of tensor, but unlike PyTorch, both Parameter and the regular Tensor inherit from the C interface known as Tensor_.

Furthermore, the requires_grad parameter behaves differently in MindSpore and PyTorch. In PyTorch, it is a backend-level attribute: when set to False, the gradient of the tensor is not computed, the tensor is not included in the computation graph, and no gradient information is recorded for its operations, which can improve computational efficiency in scenarios such as inference. In MindSpore, it is a frontend-level attribute: when set to False, MindSpore's automatic differentiation mechanism still computes gradients for the parameter in the backend; the setting only affects how the parameter is presented and used in the frontend. For example, MindSpore's trainable_params method excludes parameters whose requires_grad is False.
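The frontend filtering can be seen through trainable_params. Below is a minimal sketch, assuming a toy Cell whose structure and parameter names are made up for illustration:

```python
import numpy as np
from mindspore import nn, Parameter, Tensor

class Net(nn.Cell):  # hypothetical Cell, for illustration only
    def __init__(self):
        super().__init__()
        self.w = Parameter(Tensor(np.ones((1, 2), np.float32)), name="w")
        # requires_grad=False: gradients are still computed in the backend,
        # but the parameter is hidden from trainable_params() in the frontend.
        self.b = Parameter(Tensor(np.zeros((1, 2), np.float32)), name="b",
                           requires_grad=False)

    def construct(self, x):
        return x * self.w + self.b

net = Net()
print([p.name for p in net.trainable_params()])  # ['w']
print([p.name for p in net.get_parameters()])    # ['w', 'b'] -> both are registered
```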

Additionally, MindSpore's Parameter has an extra name parameter that PyTorch does not have. The name is strongly associated with the Parameter and is used during graph compilation in the backend and when saving checkpoints. You can specify it manually; if you do not, MindSpore names the Parameter automatically.
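As a brief sketch of the naming behavior (the explicit name below is arbitrary):

```python
import numpy as np
from mindspore import Parameter, Tensor

# Name given explicitly.
w = Parameter(Tensor(np.ones((1, 2), np.float32)), name="my_weight")
print(w.name)
# my_weight

# No name given: MindSpore assigns one automatically.
v = Parameter(Tensor(np.ones((1, 2), np.float32)))
print(v.name)
# Parameter
```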

Finally, printing a MindSpore Parameter directly does not display the values it contains; use the Parameter.asnumpy() method to access the actual values.

| Classification | Subclass | PyTorch | MindSpore | Difference |
| --- | --- | --- | --- | --- |
| Parameters | Parameter 1 | data | default_input | Consistent |
| | Parameter 2 | - | name | Differences as mentioned above |
| | Parameter 3 | requires_grad | requires_grad | Differences as mentioned above |
| | Parameter 4 | - | layerwise_parallel | MindSpore-specific parameter related to parallelism; not present in PyTorch |
| | Parameter 5 | - | parallel_optimizer | MindSpore-specific parameter related to parallelism; not present in PyTorch |

## Code Example

```python
import numpy as np
from mindspore import Parameter, Tensor

a = Parameter(Tensor(np.ones((1, 2), dtype=np.float32)))
# Printing the Parameter shows its metadata rather than its values.
print(a)
# Parameter (name=Parameter, shape=(1, 2), dtype=Float32, requires_grad=True)
# value() (or asnumpy()) gives access to the actual data.
print(a.value())
# [[1. 1.]]
```

```python
import numpy as np
import torch

b = torch.nn.parameter.Parameter(torch.tensor(np.ones((1, 2), dtype=np.float32)))
# In PyTorch, the values can be viewed directly.
print(b.data)
# tensor([[1., 1.]])
```