mindspore.nn.MaxPool3d

class mindspore.nn.MaxPool3d(kernel_size=1, stride=1, pad_mode='valid', padding=0, dilation=1, return_indices=False, ceil_mode=False)[source]

3D max pooling operation.

Applies a 3D max pooling over an input Tensor which can be regarded as a composition of 3D planes.

Typically the input is of shape \((N_{in}, C_{in}, D_{in}, H_{in}, W_{in})\), MaxPool outputs regional maximum in the \((D_{in}, H_{in}, W_{in})\)-dimension. Given kernel size is \(ks = (d_{ker}, h_{ker}, w_{ker})\) and stride is \(s = (s_0, s_1, s_2)\), the operation is as follows.

\[\text{output}(N_i, C_j, d, h, w) = \max_{l=0, \ldots, d_{ker}-1} \max_{m=0, \ldots, h_{ker}-1} \max_{n=0, \ldots, w_{ker}-1} \text{input}(N_i, C_j, s_0 \times d + l, s_1 \times h + m, s_2 \times w + n)\]

Parameters

kernel_size (Union[int, tuple[int]]) – The size of kernel used to take the maximum value, is an int number or a single element tuple that represents depth, height and width of the kernel, or a tuple of three int numbers that represent depth, height and width respectively. The value must be a positive integer. Default: 1 .
stride (Union[int, tuple[int]]) – The moving stride of pooling operation, an int number or a single element tuple that represents the moving stride of pooling kernel in the directions of depth, height and the width, or a tuple of three int numbers that represent depth, height and width of movement respectively. The value must be a positive integer. If the value is None, the default value kernel_size is used. Default: 1 .
pad_mode (str, optional) –
Specifies the padding mode with a padding value of 0. It can be set to: "same" , "valid" or "pad" . Default: "valid" .
- "same": Pad the input around its depth/height/width dimension so that the shape of input and output are the same when stride is set to 1. The amount of padding to is calculated by the operator internally. If the amount is even, it isuniformly distributed around the input, if it is odd, the excess amount goes to the front/right/bottom side. If this mode is set, padding must be 0.
- "valid": No padding is applied to the input, and the output returns the maximum possible depth, height and width. Extra pixels that could not complete a full stride will be discarded. If this mode is set, padding must be 0.
- "pad": Pad the input with a specified amount. In this mode, the amount of padding in the depth, height and width dimension is determined by the padding parameter. If this mode is set, padding must be greater than or equal to 0.
padding (Union(int, tuple[int], list[int])) – Pooling padding value. Default: 0 . padding can only be an integer or a tuple/list containing one or three integers. If padding is an integer or a tuple/list containing one integer, it will be padded in six directions of front, back, top, bottom, left and right of the input. If padding is a tuple/list containing three integers, it will be padded in front and back of the input padding[0] times, up and down padding[1] times, and left and right of the input padding[2] times.
dilation (Union(int, tuple[int])) – The spacing between the elements of the kernel in convolution, used to increase the receptive field of the pooling operation. If it is a tuple, it must contain one or three integers. Default: 1 .
return_indices (bool) – If True , output is a Tuple of 2 Tensors, representing the maxpool result and where the max values are generated. Otherwise, only the maxpool result is returned. Default: False .
ceil_mode (bool) – If True, use ceil to calculate output shape. If False, use ceil to calculate output shape. Default: False .

Inputs:

x (Tensor) - Tensor of shape \((N_{in}, C_{in}, D_{in}, H_{in}, W_{in})\) or \((C_{in}, D_{in}, H_{in}, W_{in})\).

Outputs:

If return_indices is False, output is a Tensor, with shape \((N_{out}, C_{out}, D_{out}, H_{out}, W_{out})\) or \((C_{out}, D_{out}, H_{out}, W_{out})\). It has the same data type as x.

If return_indices is True, output is a Tuple of 2 Tensors, representing the maxpool result and where the max values are generated.

output (Tensor) - Maxpooling result, with shape \((N_{out}, C_{out}, D_{out}, H_{out}, W_{out})\) or \((C_{out}, D_{out}, H_{out}, W_{out})\). It has the same data type as x.
argmax (Tensor) - Index corresponding to the maximum value. Data type is int64.

If pad_mode is in "pad" mode, the output shape calculation formula is as follows:

\[D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]

\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]

\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - \text{dilation}[2] \times (\text{kernel_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\]

Raises

ValueError – If length of shape of x is not equal to 4 or 5.
TypeError – If kernel_size , stride , padding or dilation is neither an int nor a tuple.
ValueError – If kernel_size or stride is less than 1.
ValueError – If the padding parameter is neither an integer nor a tuple of length 3.
ValueError – If pad_mode is not set to "pad", setting return_indices to True or dilation to a value other than 1.
ValueError – If padding is non-zero when pad_mode is not "pad".

Supported Platforms:: Ascend GPU CPU

Examples

>>> import mindspore as ms
>>> import mindspore.nn as nn
>>> import numpy as np
>>> np_x = np.random.randint(0, 10, [5, 3, 4, 6, 7])
>>> x = Tensor(np_x, ms.float32)
>>> pool1 = nn.MaxPool3d(kernel_size=2, stride=1, pad_mode="pad", padding=1, dilation=3, return_indices=True)
>>> output = pool1(x)
>>> print(output[0].shape)
(5, 3, 3, 5, 6)
>>> print(output[1].shape)
(5, 3, 3, 5, 6)
>>> pool2 = nn.MaxPool3d(kernel_size=2, stride=1, pad_mode="pad", padding=1, dilation=3, return_indices=False)
>>> output2 = pool2(x)
>>> print(output2.shape)
(5, 3, 3, 5, 6)