mindspore.ops.Conv2D

class mindspore.ops.Conv2D(*args, **kwargs)[source]

2D convolution layer.

Applies a 2D convolution over an input tensor which is typically of shape \((N, C_{in}, H_{in}, W_{in})\), where \(N\) is batch size and \(C_{in}\) is channel number. For each batch of shape \((C_{in}, H_{in}, W_{in})\), the formula is defined as:

\[out_j = \sum_{i=0}^{C_{in} - 1} ccor(W_{ij}, X_i) + b_j,\]

where \(ccor\) is the cross correlation operator, \(C_{in}\) is the input channel number, \(j\) ranges from \(0\) to \(C_{out} - 1\), \(W_{ij}\) corresponds to the \(i\)-th channel of the \(j\)-th filter and \(out_{j}\) corresponds to the \(j\)-th channel of the output. \(W_{ij}\) is a slice of kernel and it has shape \((\text{ks_h}, \text{ks_w})\), where \(\text{ks_h}\) and \(\text{ks_w}\) are the height and width of the convolution kernel. The full kernel has shape \((C_{out}, C_{in} // \text{group}, \text{ks_h}, \text{ks_w})\), where group is the group number to split the input in the channel dimension.

If the ‘pad_mode’ is set to be “valid”, the output height and width will be \(\left \lfloor{1 + \frac{H_{in} + 2 \times \text{padding} - \text{ks_h} - (\text{ks_h} - 1) \times (\text{dilation} - 1) }{\text{stride}}} \right \rfloor\) and \(\left \lfloor{1 + \frac{W_{in} + 2 \times \text{padding} - \text{ks_w} - (\text{ks_w} - 1) \times (\text{dilation} - 1) }{\text{stride}}} \right \rfloor\) respectively.

The first introduction can be found in paper Gradient Based Learning Applied to Document Recognition. More detailed introduction can be found here: http://cs231n.github.io/convolutional-networks/.

Parameters

out_channel (int) – The dimension of the output.
kernel_size (Union[int, tuple[int]]) – The kernel size of the 2D convolution.
mode (int) – Modes for different convolutions. 0 Math convolutiuon, 1 cross-correlation convolution , 2 deconvolution, 3 depthwise convolution. Default: 1.
pad_mode (str) – Modes to fill padding. It could be “valid”, “same”, or “pad”. Default: “valid”.
pad (Union(int, tuple[int])) – The pad value to be filled. Default: 0. If pad is an integer, the paddings of top, bottom, left and right are the same, equal to pad. If pad is a tuple of four integers, the padding of top, bottom, left and right equal to pad[0], pad[1], pad[2], and pad[3] correspondingly.
stride (Union(int, tuple[int])) – The stride to be applied to the convolution filter. Default: 1.
dilation (Union(int, tuple[int])) – Specifies the space to use between kernel elements. Default: 1.
group (int) – Splits input into groups. Default: 1.
data_format (str) – The optional value for data format, is ‘NHWC’ or ‘NCHW’. Default: “NCHW”.

Inputs:

input (Tensor) - Tensor of shape \((N, C_{in}, H_{in}, W_{in})\).
weight (Tensor) - Set size of kernel is \((K_1, K_2)\), then the shape is \((C_{out}, C_{in}, K_1, K_2)\).

Outputs:

Tensor, the value that applied 2D convolution. The shape is \((N, C_{out}, H_{out}, W_{out})\).

Raises

TypeError – If kernel_size, stride, pad or dilation is neither an int nor a tuple.
TypeError – If out_channel or group is not an int.
ValueError – If kernel_size, stride or dilation is less than 1.
ValueError – If pad_mode is not one of ‘same’, ‘valid’, ‘pad’.
ValueError – If pad is a tuple whose length is not equal to 4.
ValueError – If pad_mode it not equal to ‘pad’ and pad is not equal to (0, 0, 0, 0).
ValueError – If data_format is neither ‘NCHW’ not ‘NHWC’.

Supported Platforms:: Ascend GPU CPU

Examples

>>> input_tensor = Tensor(np.ones([10, 32, 32, 32]), mindspore.float32)
>>> weight = Tensor(np.ones([32, 32, 3, 3]), mindspore.float32)
>>> conv2d = ops.Conv2D(out_channel=32, kernel_size=3)
>>> output = conv2d(input_tensor, weight)
>>> print(output.shape)
(10, 32, 30, 30)