sciai.architecture.MLPShortcut
- class sciai.architecture.MLPShortcut(layers, weight_init='xavier_trunc_normal', bias_init='zeros', activation='tanh', last_activation=None)[source]
Multi-layer perceptron with shortcut connections. By default, no activation function is applied to the last layer. For details of this improved MLP architecture, please check: Understanding and mitigating gradient pathologies in physics-informed neural networks.
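For intuition, the shortcut design in that paper computes two encoder branches from the input and uses them to gate every hidden layer. The following is a minimal sketch of that forward pass, not the actual sciai implementation; the class and attribute names are hypothetical, and it assumes all hidden layers share the width layers[1]:

import mindspore.nn as nn

class ModifiedMLPSketch(nn.Cell):
    """Hypothetical illustration of the gated-shortcut MLP forward pass."""
    def __init__(self, layers):
        super().__init__()
        self.act = nn.Tanh()
        # Two encoder branches U and V, each computed once from the input.
        self.encoder_u = nn.Dense(layers[0], layers[1])
        self.encoder_v = nn.Dense(layers[0], layers[1])
        dims = list(zip(layers[:-1], layers[1:]))
        self.hidden = nn.CellList([nn.Dense(i, o) for i, o in dims[:-1]])
        self.last = nn.Dense(*dims[-1])  # last layer: no activation by default

    def construct(self, x):
        u = self.act(self.encoder_u(x))
        v = self.act(self.encoder_v(x))
        h = x
        for dense in self.hidden:
            z = self.act(dense(h))
            # Shortcut gating: each hidden state blends the two encoder branches.
            h = (1 - z) * u + z * v
        return self.last(h)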
- Parameters
layers (Union(tuple[int], list[int])) – List of numbers of neurons in each layer, e.g., [2, 10, 10, 1].
weight_init (Union(str, Initializer)) – The weight_init parameter for Dense. The dtype is the same as x. The values of str refer to the function initializer. Default: ‘xavier_trunc_normal’.
bias_init (Union(str, Initializer)) – The bias_init parameter for Dense. The dtype is the same as x. The values of str refer to the function initializer. Default: ‘zeros’.
activation (Union(str, Cell, Primitive, FunctionType, None)) – Activation function applied to the output of each fully connected layer excluding the last layer. Both activation names, e.g. ‘relu’, and MindSpore activation functions, e.g. nn.ReLU(), are supported (see the second snippet under Examples). Default: ‘tanh’.
last_activation (Union(str, Cell, Primitive, FunctionType, None)) – Activation function applied to the output of the last dense layer. The type rules are the same as those of activation. Default: None.
- Inputs:
x (Tensor) - Input Tensor of the network.
- Outputs:
Union(Tensor, tuple[Tensor]) - Output Tensor of the network.
- Raises
TypeError – If layers is neither a list nor a tuple, or if any of its elements is not an int.
TypeError – If activation is not one of str, Cell, Primitive, FunctionType or None.
TypeError – If last_activation is not one of str, Cell, Primitive, FunctionType or None.
TypeError – If weight_init is not str or Initializer.
TypeError – If bias_init is not str or Initializer.
- Supported Platforms:
GPU
CPU
Ascend
Examples
>>> import mindspore as ms
>>> import numpy as np
>>> from sciai.architecture import MLPShortcut
>>> x = ms.Tensor(np.array([[180, 234, 154], [244, 48, 247]]), ms.float32)
>>> net = MLPShortcut((3, 10, 4))
>>> output = net(x)
>>> print(output.shape)
(2, 4)
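The constructor arguments can also be set explicitly. The following snippet is a sketch that assumes only the parameter behavior documented above, passing a MindSpore Cell as activation and a string as last_activation:

>>> import mindspore.nn as nn
>>> # Illustrative non-default configuration, reusing x from the example above.
>>> net2 = MLPShortcut((3, 10, 4), activation=nn.ReLU(), last_activation='tanh')
>>> output2 = net2(x)
>>> print(output2.shape)
(2, 4)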