mindspore.nn.Conv2dTranspose

class mindspore.nn.Conv2dTranspose(in_channels, out_channels, kernel_size, stride=1, pad_mode='same', padding=0, dilation=1, group=1, has_bias=False, weight_init='normal', bias_init='zeros')[source]

2D transposed convolution layer.

Compute a 2D transposed convolution, which is also known as a deconvolution (although it is not an actual deconvolution). This module can be seen as the gradient of Conv2d with respect to its input.

x is typically of shape \((N, C, H, W)\), where \(N\) is batch size, \(C\) is channel number, \(H\) is the height of the characteristic layer and \(W\) is the width of the characteristic layer.

The pad_mode argument effectively adds \(dilation * (kernel\_size - 1) - padding\) amount of zero padding to both sizes of the input. So that when a Conv2d and a ConvTranspose2d are initialized with same parameters, they are inverses of each other in regard to the input and output shapes. However, when stride > 1, Conv2d maps multiple input shapes to the same output shape. ConvTranspose2d provide padding argument to increase the calculated output shape on one or more side.

The height and width of output are defined as:

if the ‘pad_mode’ is set to be “pad”,

\[ \begin{align}\begin{aligned}H_{out} = (H_{in} - 1) \times \text{stride[0]} - \left (\text{padding[0]} + \text{padding[1]}\right ) + \text{dilation[0]} \times (\text{kernel_size[0]} - 1) + 1\\W_{out} = (W_{in} - 1) \times \text{stride[1]} - \left (\text{padding[2]} + \text{padding[3]}\right ) + \text{dilation[1]} \times (\text{kernel_size[1]} - 1) + 1\end{aligned}\end{align} \]

if the ‘pad_mode’ is set to be “same”,

\[\begin{split}H_{out} = (H_{in} + \text{stride[0]} - 1)/\text{stride[0]} \\ W_{out} = (W_{in} + \text{stride[1]} - 1)/\text{stride[1]}\end{split}\]

if the ‘pad_mode’ is set to be “valid”,

\[\begin{split}H_{out} = (H_{in} - 1) \times \text{stride[0]} + \text{dilation[0]} \times (\text{ks_w[0]} - 1) + 1 \\ W_{out} = (W_{in} - 1) \times \text{stride[1]} + \text{dilation[1]} \times (\text{ks_w[1]} - 1) + 1\end{split}\]

where \(\text{kernel_size[0]}\) is the height of the convolution kernel and \(\text{kernel_size[1]}\) is the width of the convolution kernel.

Parameters
  • in_channels (int) – The number of channels in the input space.

  • out_channels (int) – The number of channels in the output space.

  • kernel_size (Union[int, tuple]) – int or a tuple of 2 integers, which specifies the height and width of the 2D convolution window. Single int means the value is for both the height and the width of the kernel. A tuple of 2 ints means the first value is for the height and the other is for the width of the kernel.

  • stride (Union[int, tuple[int]]) – The distance of kernel moving, an int number that represents the height and width of movement are both strides, or a tuple of two int numbers that represent height and width of movement respectively. Its value must be equal to or greater than 1. Default: 1.

  • pad_mode (str) –

    Select the mode of the pad. The optional values are “pad”, “same”, “valid”. Default: “same”.

    • pad: Implicit paddings on both sides of the input x.

    • same: Adopted the way of completion.

    • valid: Adopted the way of discarding.

  • padding (Union[int, tuple[int]]) – Implicit paddings on both sides of the input x. If padding is one integer, the paddings of top, bottom, left and right are the same, equal to padding. If padding is a tuple with four integers, the paddings of top, bottom, left and right will be equal to padding[0], padding[1], padding[2], and padding[3] accordingly. Default: 0.

  • dilation (Union[int, tuple[int]]) – The data type is int or a tuple of 2 integers. Specifies the dilation rate to use for dilated convolution. If set to be \(k > 1\), there will be \(k - 1\) pixels skipped for each sampling location. Its value must be greater than or equal to 1 and bounded by the height and width of the input x. Default: 1.

  • group (int) – Splits filter into groups, in_channels and out_channels must be divisible by the number of groups. This does not support for Davinci devices when group > 1. Default: 1.

  • has_bias (bool) – Specifies whether the layer uses a bias vector. Default: False.

  • weight_init (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the convolution kernel. It can be a Tensor, a string, an Initializer or a number. When a string is specified, values from ‘TruncatedNormal’, ‘Normal’, ‘Uniform’, ‘HeUniform’ and ‘XavierUniform’ distributions as well as constant ‘One’ and ‘Zero’ distributions are possible. Alias ‘xavier_uniform’, ‘he_uniform’, ‘ones’ and ‘zeros’ are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of Initializer for more details. Default: ‘normal’.

  • bias_init (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the bias vector. Possible Initializer and string are the same as ‘weight_init’. Refer to the values of Initializer for more details. Default: ‘zeros’.

Inputs:
  • x (Tensor) - Tensor of shape \((N, C_{in}, H_{in}, W_{in})\).

Outputs:

Tensor of shape \((N, C_{out}, H_{out}, W_{out})\).

Raises
  • TypeError – If in_channels, out_channels or group is not an int.

  • TypeError – If kernel_size, stride, padding or dilation is neither an int not a tuple.

  • ValueError – If in_channels, out_channels, kernel_size, stride or dilation is less than 1.

  • ValueError – If padding is less than 0.

  • ValueError – If pad_mode is not one of ‘same’, ‘valid’, ‘pad’.

  • ValueError – If padding is a tuple whose length is not equal to 4.

  • ValueError – If pad_mode is not equal to ‘pad’ and padding is not equal to (0, 0, 0, 0).

Supported Platforms:

Ascend GPU CPU

Examples

>>> net = nn.Conv2dTranspose(3, 64, 4, has_bias=False, weight_init='normal', pad_mode='pad')
>>> x = Tensor(np.ones([1, 3, 16, 50]), mindspore.float32)
>>> output = net(x).shape
>>> print(output)
(1, 64, 19, 53)