mindspore.ops.ROIAlign

class mindspore.ops.ROIAlign(pooled_height, pooled_width, spatial_scale, sample_num=2, roi_end_mode=1)

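Region of Interest (RoI) Align operator.

RoI Align computes the value of each sampling point by bilinear interpolation from the nearby grid points on the feature map. No quantization is applied to any coordinate of the RoI, its bins, or the sampling points. For details, see the paper Mask R-CNN.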
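The bilinear interpolation step can be illustrated with a minimal pure-Python sketch. The helper name `bilinear_sample` is hypothetical and not part of MindSpore, and clamping the point to the feature map border is an assumption made for this illustration; the operator's own boundary handling may differ.

```python
import math


def bilinear_sample(feature, y, x):
    """Interpolate a 2-D feature map (list of rows) at a real-valued point (y, x)."""
    height, width = len(feature), len(feature[0])
    # Assumption: clamp the point into the map so all four neighbours exist.
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    ly, lx = y - y0, x - x0  # fractional offsets, never rounded away
    top = feature[y0][x0] * (1 - lx) + feature[y0][x1] * lx
    bottom = feature[y1][x0] * (1 - lx) + feature[y1][x1] * lx
    return top * (1 - ly) + bottom * ly


feature = [[1.0, 2.0], [3.0, 4.0]]
center = bilinear_sample(feature, 0.5, 0.5)  # midpoint of the four grid values
```

Because the fractional coordinates are used directly as interpolation weights, no rounding to integer grid positions takes place, which is the property the description above refers to as "no quantization".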

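Parameters:

pooled_height (int) - Height of the output feature.
pooled_width (int) - Width of the output feature.
spatial_scale (float) - Scaling factor that maps coordinates in the original image to coordinates in the input feature map. If the height of a RoI is ori_h in the original image and fea_h in the input feature map, spatial_scale should be fea_h / ori_h.
sample_num (int) - Number of sampling points. Default: 2.
roi_end_mode (int) - Must be 0 or 1. If 0, the legacy implementation of this operator is used. If 1, the pixel at the end of the RoI is shifted by +1*spatial_scale. Default: 1.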
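The effect of spatial_scale and roi_end_mode on RoI coordinates can be sketched as follows. This is one reading of the parameter descriptions above, not the operator's source code: `scale_roi` is a hypothetical helper, and the legacy behaviour for roi_end_mode=0 is assumed here to be plain scaling with no end offset.

```python
def scale_roi(x1, y1, x2, y2, spatial_scale, roi_end_mode=1):
    """Map RoI corners from image coordinates to feature-map coordinates."""
    # Assumption: mode 1 moves the end corner by one image pixel before
    # scaling, which equals a shift of +1 * spatial_scale on the feature map.
    offset = 1.0 if roi_end_mode == 1 else 0.0
    return (
        x1 * spatial_scale,
        y1 * spatial_scale,
        (x2 + offset) * spatial_scale,
        (y2 + offset) * spatial_scale,
    )


# A feature map that is 16 times smaller than the image: fea_h / ori_h = 1 / 16.
spatial_scale = 1.0 / 16.0
shifted = scale_roi(32, 64, 95, 127, spatial_scale, roi_end_mode=1)
legacy = scale_roi(32, 64, 95, 127, spatial_scale, roi_end_mode=0)
```

With the end offset, a RoI covering image pixels 32 to 95 spans exactly 4 feature-map cells in x (2.0 to 6.0); without it, the span is slightly shorter (2.0 to 5.9375).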

Inputs:

features (Tensor) - Input features, with shape $(N, C, H, W)$.
rois (Tensor) - Shape $(rois\_n, 5)$, with data type float16 or float32. rois_n is the number of RoIs. The size of the second dimension must be 5, representing $(image\_index, top\_left\_x, top\_left\_y, bottom\_right\_x, bottom\_right\_y)$. image_index is the index of the image; top_left_x and top_left_y are the x and y coordinates of the top-left corner of the RoI; bottom_right_x and bottom_right_y are the x and y coordinates of the bottom-right corner of the RoI.
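A small sketch of how the rois layout described above might be assembled from per-image box lists. `build_rois` is a hypothetical helper, not a MindSpore function; the resulting nested list would still need to be wrapped in a Tensor of type float16 or float32 before being passed to the operator.

```python
def build_rois(boxes_per_image):
    """Flatten per-image (x1, y1, x2, y2) boxes into rows of
    (image_index, top_left_x, top_left_y, bottom_right_x, bottom_right_y)."""
    rois = []
    for image_index, boxes in enumerate(boxes_per_image):
        for x1, y1, x2, y2 in boxes:
            rois.append([float(image_index), float(x1), float(y1), float(x2), float(y2)])
    return rois


# Two boxes on image 0 and one box on image 1 give rois_n = 3.
rois = build_rois([
    [(10, 20, 50, 80), (0, 0, 31, 31)],
    [(5, 5, 25, 45)],
])
```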

Outputs:

Tensor, with shape $(rois\_n, C, pooled\_height, pooled\_width)$.
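The output shape follows directly from the input shapes and the pooling parameters. The helper below is a hypothetical illustration of that bookkeeping and is not part of MindSpore; the ValueError it raises is a choice made for this sketch only.

```python
def roi_align_output_shape(features_shape, rois_shape, pooled_height, pooled_width):
    """Return (rois_n, C, pooled_height, pooled_width) for the given input shapes."""
    _, channels, _, _ = features_shape  # features: (N, C, H, W)
    rois_n, roi_fields = rois_shape     # rois: (rois_n, 5)
    if roi_fields != 5:
        raise ValueError("the second dimension of rois must be 5")
    return (rois_n, channels, pooled_height, pooled_width)


out_shape = roi_align_output_shape((2, 256, 50, 68), (100, 5), 7, 7)
```

Note that the batch size N of features does not appear in the result: each RoI selects its own image through image_index, so the leading dimension is the number of RoIs.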

Raises:

TypeError - If pooled_height, pooled_width, sample_num, or roi_end_mode is not an int.
TypeError - If spatial_scale is not a float.
TypeError - If features or rois is not a Tensor.

Supported Platforms: Ascend GPU CPU

Examples:

>>> import numpy as np
>>> import mindspore
>>> from mindspore import Tensor, ops
>>> features = Tensor(np.array([[[[1., 2.], [3., 4.]]]]), mindspore.float32)
>>> rois = Tensor(np.array([[0, 0.2, 0.3, 0.2, 0.3]]), mindspore.float32)
>>> roi_align = ops.ROIAlign(2, 2, 0.5, 2)
>>> output = roi_align(features, rois)
>>> print(output)
[[[[1.775 2.025]
   [2.275 2.525]]]]
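To make the numbers in the sample easier to follow, here is a pure-Python reference sketch for a single RoI on a single channel. It is not MindSpore's implementation: the function names are hypothetical, the sampling grid (sample_num x sample_num evenly spaced points per bin, averaged) and the border clamping are assumptions, and the end offset follows the roi_end_mode=1 description in the parameter list.

```python
import math


def _bilinear(fmap, y, x):
    # Assumption: clamp to the map border so all four neighbours exist.
    h, w = len(fmap), len(fmap[0])
    y, x = min(max(y, 0.0), h - 1.0), min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - lx) + fmap[y0][x1] * lx
    bottom = fmap[y1][x0] * (1 - lx) + fmap[y1][x1] * lx
    return top * (1 - ly) + bottom * ly


def roi_align_single(fmap, roi, pooled_height, pooled_width,
                     spatial_scale, sample_num=2, roi_end_mode=1):
    """Average-pool bilinear samples inside each bin of one RoI."""
    x1, y1, x2, y2 = roi
    offset = 1.0 if roi_end_mode == 1 else 0.0
    start_x, start_y = x1 * spatial_scale, y1 * spatial_scale
    end_x, end_y = (x2 + offset) * spatial_scale, (y2 + offset) * spatial_scale
    bin_h = (end_y - start_y) / pooled_height
    bin_w = (end_x - start_x) / pooled_width
    out = []
    for ph in range(pooled_height):
        row = []
        for pw in range(pooled_width):
            total = 0.0
            for iy in range(sample_num):
                # Sample points sit at the centres of a regular sub-grid.
                y = start_y + (ph + (iy + 0.5) / sample_num) * bin_h
                for ix in range(sample_num):
                    x = start_x + (pw + (ix + 0.5) / sample_num) * bin_w
                    total += _bilinear(fmap, y, x)
            row.append(total / (sample_num * sample_num))
        out.append(row)
    return out


fmap = [[1.0, 2.0], [3.0, 4.0]]
pooled = roi_align_single(fmap, (0.2, 0.3, 0.2, 0.3), 2, 2, 0.5, sample_num=2)
```

Under these assumptions, `pooled` agrees with the values printed in the sample above (1.775, 2.025, 2.275, 2.525) up to floating-point rounding.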