Differences with torch.nn.Linear


torch.nn.Linear

class torch.nn.Linear(
    in_features,
    out_features,
    bias=True
)(input) -> Tensor

For more information, see torch.nn.Linear.

mindspore.nn.Dense

class mindspore.nn.Dense(
    in_channels,
    out_channels,
    weight_init=None,
    bias_init=None,
    has_bias=True,
    activation=None
)(x) -> Tensor

For more information, see mindspore.nn.Dense.

Differences

PyTorch: A fully connected layer that applies a linear transformation to the input, y = xA^T + b.

MindSpore: This API provides essentially the same functionality as its PyTorch counterpart. In addition, it can apply an activation function to the output of the fully connected layer through the activation parameter.
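
The difference can be illustrated with a small NumPy sketch of what both layers compute. This is not either library's implementation: `dense_forward`, `relu`, and the `activation` argument are hypothetical names used only for illustration. The weight is assumed to have shape (out_channels, in_channels), which is the layout both layers use.

```python
import numpy as np

def dense_forward(x, weight, bias=None, activation=None):
    """Hypothetical helper: compute activation(x @ weight.T + bias)."""
    out = x @ weight.T                 # linear part, shared by both APIs
    if bias is not None:
        out = out + bias               # optional bias (bias / has_bias)
    if activation is not None:
        out = activation(out)          # extra step that Dense can fold in
    return out

def relu(v):
    return np.maximum(v, 0.0)

x = np.array([[1.0, -2.0, 3.0], [0.5, 0.0, -1.0]], dtype=np.float32)
weight = np.ones((4, 3), dtype=np.float32)   # (out_channels, in_channels)
bias = np.zeros(4, dtype=np.float32)

linear_out = dense_forward(x, weight, bias)                  # like nn.Linear
fused_out = dense_forward(x, weight, bias, activation=relu)  # like Dense with an activation
print(linear_out.shape, fused_out.shape)
# (2, 4) (2, 4)
```

In PyTorch the same effect is obtained by placing a separate activation module after nn.Linear.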

Weight Initialization Difference

When weight_init of mindspore.nn.Dense is None, the weight is initialized with HeUniform, which matches the default weight initialization in PyTorch.

When bias_init of mindspore.nn.Dense is None, the bias is initialized with Uniform, which matches the default bias initialization in PyTorch.
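
A little arithmetic shows how a He-uniform weight and a uniform bias can share the same range. The sketch below assumes a He (Kaiming) uniform initializer with negative slope sqrt(5) and a bias bound of 1/sqrt(fan_in); these specific values are assumptions taken from the commonly documented defaults, not from this page, and `he_uniform_bound` is a hypothetical helper.

```python
import math

def he_uniform_bound(fan_in, negative_slope=0.0):
    """Hypothetical helper: half-width of the He-uniform sampling interval."""
    gain = math.sqrt(2.0 / (1.0 + negative_slope ** 2))  # leaky-ReLU gain
    std = gain / math.sqrt(fan_in)
    return math.sqrt(3.0) * std       # U(-bound, bound) has this std

fan_in = 3  # in_features / in_channels
weight_bound = he_uniform_bound(fan_in, negative_slope=math.sqrt(5))  # assumed slope
bias_bound = 1.0 / math.sqrt(fan_in)                                  # assumed bias range

# With slope sqrt(5) the gain is sqrt(1/3), so both bounds reduce to 1/sqrt(fan_in).
print(math.isclose(weight_bound, bias_bound))
# True
```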

| Categories | Subcategories | PyTorch | MindSpore | Differences |
| --- | --- | --- | --- | --- |
| Parameters | Parameter 1 | in_features | in_channels | Same function, different parameter names |
| | Parameter 2 | out_features | out_channels | Same function, different parameter names |
| | Parameter 3 | bias | has_bias | Same function, different parameter names |
| | Parameter 4 | - | weight_init | Initialization method for the weight parameter, which is not available in PyTorch |
| | Parameter 5 | - | bias_init | Initialization method for the bias parameter, which is not available in PyTorch |
| | Parameter 6 | - | activation | Activation function applied to the output of the fully connected layer, which is not available in PyTorch |
| Input | Single input | input | x | Same function, different parameter names |
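
When porting code, the renamed parameters can be translated mechanically. The following is a minimal sketch, assuming only the three renamed constructor parameters need to be mapped; `LINEAR_TO_DENSE` and `linear_kwargs_to_dense` are hypothetical names and are not part of either library.

```python
# Hypothetical mapping from torch.nn.Linear keyword names to mindspore.nn.Dense names.
LINEAR_TO_DENSE = {
    "in_features": "in_channels",
    "out_features": "out_channels",
    "bias": "has_bias",
}

def linear_kwargs_to_dense(kwargs):
    """Rename Linear keyword arguments to their Dense equivalents."""
    unknown = set(kwargs) - set(LINEAR_TO_DENSE)
    if unknown:
        # Anything outside the mapping needs manual handling.
        raise KeyError(f"no Dense equivalent for: {sorted(unknown)}")
    return {LINEAR_TO_DENSE[name]: value for name, value in kwargs.items()}

dense_kwargs = linear_kwargs_to_dense(
    {"in_features": 3, "out_features": 4, "bias": False}
)
print(dense_kwargs)
# {'in_channels': 3, 'out_channels': 4, 'has_bias': False}
```

The Dense-only parameters (weight_init, bias_init, activation) have no PyTorch source to map from, so they keep their defaults unless set explicitly.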

Code Example

The two APIs implement the same function and are used in the same way.

# PyTorch
import torch
from torch import nn
import numpy as np

net = nn.Linear(3, 4)
x = torch.tensor(np.array([[180, 234, 154], [244, 48, 247]]), dtype=torch.float)
output = net(x)
print(output.detach().numpy().shape)
# (2, 4)

# MindSpore
import mindspore
from mindspore import Tensor, nn
import numpy as np

x = Tensor(np.array([[180, 234, 154], [244, 48, 247]]), mindspore.float32)
net = nn.Dense(3, 4)
output = net(x)
print(output.shape)
# (2, 4)