sciai.architecture.MLPShortcut

class sciai.architecture.MLPShortcut(layers, weight_init='xavier_trunc_normal', bias_init='zeros', activation='tanh', last_activation=None)

Multi-layer perceptron with shortcut connections. By default, the last layer has no activation function. For details of this improved MLP architecture, please refer to: Understanding and mitigating gradient pathologies in physics-informed neural networks.
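
The sketch below is an illustrative reimplementation of the shortcut ("modified MLP") idea from that paper, not the actual internals of MLPShortcut: two shared encoder branches are computed from the input, and every hidden-layer output gates a mixture of the two branches. The class name, the shared hidden width, and the fixed tanh activation are assumptions made for this sketch only.

>>> import numpy as np
>>> import mindspore as ms
>>> import mindspore.nn as nn
>>> class ModifiedMLPSketch(nn.Cell):
...     def __init__(self, layers):
...         super().__init__()
...         in_dim, *hidden, out_dim = layers
...         self.act = nn.Tanh()
...         # Two shared encoder branches computed once from the raw input.
...         self.encoder_u = nn.Dense(in_dim, hidden[0])
...         self.encoder_v = nn.Dense(in_dim, hidden[0])
...         dims = [in_dim] + hidden
...         self.hidden = nn.CellList([nn.Dense(dims[i], dims[i + 1]) for i in range(len(hidden))])
...         self.out = nn.Dense(hidden[-1], out_dim)  # last layer: no activation
...     def construct(self, x):
...         u = self.act(self.encoder_u(x))
...         v = self.act(self.encoder_v(x))
...         h = x
...         for dense in self.hidden:
...             z = self.act(dense(h))
...             h = (1 - z) * u + z * v  # shortcut: mix the two encoder branches
...         return self.out(h)
...
>>> x = ms.Tensor(np.ones((2, 3)), ms.float32)
>>> print(ModifiedMLPSketch((3, 10, 10, 4))(x).shape)
(2, 4)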

Parameters
  • layers (Union(tuple[int], list[int])) – List of numbers of neurons in each layer, e.g., [2, 10, 10, 1].

  • weight_init (Union(str, Initializer)) – The weight_init parameter passed to each Dense layer. Its dtype is the same as that of x. String values refer to initializer function names. Default: ‘xavier_trunc_normal’.

  • bias_init (Union(str, Initializer)) – The bias_init parameter passed to each Dense layer. Its dtype is the same as that of x. String values refer to initializer function names. Default: ‘zeros’.

  • activation (Union(str, Cell, Primitive, FunctionType, None)) – Activation function applied to the output of each fully connected layer except the last. Both an activation name, e.g. ‘relu’, and a MindSpore activation function, e.g. nn.ReLU(), are supported (see the sketch after this parameter list). Default: ‘tanh’.

  • last_activation (Union(str, Cell, Primitive, FunctionType, None)) – Activation function applied to the output of the last dense layer. The type rules are the same as those of activation. Default: None.
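
As a hedged configuration sketch (the layer sizes are chosen only for illustration, and ‘normal’ / ‘zeros’ are assumed to be accepted because they are standard MindSpore initializer names), the activation and initializers can be overridden like this:

>>> import mindspore.nn as nn
>>> from sciai.architecture import MLPShortcut
>>> net = MLPShortcut([2, 20, 20, 1], weight_init='normal', bias_init='zeros',
...                   activation=nn.ReLU(), last_activation='tanh')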

Inputs:
  • x (Tensor) - Input Tensor of the network.

Outputs:

Union(Tensor, tuple[Tensor]), Output Tensor of the network.

Raises
  • TypeError – If layers is neither a list nor a tuple, or if any of its elements is not an int.

  • TypeError – If activation is not one of str, Cell, Primitive, FunctionType or None.

  • TypeError – If last_activation is not one of str, Cell, Primitive, FunctionType or None.

  • TypeError – If weight_init is not str or Initializer.

  • TypeError – If bias_init is not str or Initializer.

Supported Platforms:

GPU CPU Ascend

Examples

>>> import mindspore as ms
>>> import numpy as np
>>> from sciai.architecture import MLPShortcut
>>> x = ms.Tensor(np.array([[180, 234, 154], [244, 48, 247]]), ms.float32)
>>> net = MLPShortcut((3, 10, 4))
>>> output = net(x)
>>> print(output.shape)
(2, 4)