mindspore.ops.MaxPoolWithArgmax

class mindspore.ops.MaxPoolWithArgmax(*args, **kwargs)[source]

Performs max pooling on the input Tensor and returns both max values and indices.

Typically the input is of shape \((N_{in}, C_{in}, H_{in}, W_{in})\), MaxPool outputs regional maximum in the \((H_{in}, W_{in})\)-dimension. Given kernel size \(ks = (h_{ker}, w_{ker})\) and stride \(s = (s_0, s_1)\), the operation is as follows.

\[\text{output}(N_i, C_j, h, w) = \max_{m=0, \ldots, h_{ker}-1} \max_{n=0, \ldots, w_{ker}-1} \text{input}(N_i, C_j, s_0 \times h + m, s_1 \times w + n)\]
Parameters
  • kernel_size (Union[int, tuple[int]]) – The size of kernel used to take the maximum value and arg value, is an int number that represents height and width are both kernel_size, or a tuple of two int numbers that represent height and width respectively. Default: 1.

  • strides (Union[int, tuple[int]]) – The distance of kernel moving, an int number that represents the height and width of movement are both strides, or a tuple of two int numbers that represent height and width of movement respectively. Default: 1.

  • pad_mode (str) –

    The optional value for pad mode, is “same” or “valid”, not case sensitive. Default: “valid”.

    • same: Adopts the way of completion. The height and width of the output will be the same as the input. The total number of padding will be calculated in horizontal and vertical directions and evenly distributed to top and bottom, left and right if possible. Otherwise, the last extra padding will be done from the bottom and the right side.

    • valid: Adopts the way of discarding. The possible largest height and width of output will be returned without padding. Extra pixels will be discarded.

Inputs:
  • input (Tensor) - Tensor of shape \((N, C_{in}, H_{in}, W_{in})\). Data type must be float16 or float32.

Outputs:

Tuple of 2 Tensors, representing the maxpool result and where the max values are generated.

  • output (Tensor) - Maxpooling result, with shape \((N, C_{out}, H_{out}, W_{out})\). It has the same data type as input.

  • mask (Tensor) - Max values’ index represented by the mask. Data type is int32.

Raises
  • TypeError – If the input data type is not float16 or float32.

  • TypeError – If kernel_size or strides is neither an int nor a tuple.

  • TypeError – If input is not a Tensor.

Supported Platforms:

Ascend GPU

Examples

>>> input_tensor = Tensor(np.arange(1 * 3 * 3 * 4).reshape((1, 3, 3, 4)), mindspore.float32)
>>> maxpool_arg_op = ops.MaxPoolWithArgmax(pad_mode="VALID", kernel_size=2, strides=1)
>>> output_tensor, argmax = maxpool_arg_op(input_tensor)
>>> print(output_tensor)
[[[[ 5.  6.  7.]
   [ 9. 10. 11.]]
  [[17. 18. 19.]
   [21. 22. 23.]]
  [[29. 30. 31.]
   [33. 34. 35.]]]]