mindspore.dataset.vision.RandomResizedCropWithBBox

class mindspore.dataset.vision.RandomResizedCropWithBBox(size, scale=(0.08, 1.0), ratio=(3. / 4., 4. / 3.), interpolation=Inter.BILINEAR, max_attempts=10)

Crop a region of random size and aspect ratio from the input image, resize it to the given size, and adjust the bounding boxes accordingly.
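The bounding-box adjustment can be pictured as a crop step followed by a resize step. The sketch below is only an illustration in plain Python, not MindSpore's implementation. It assumes boxes are stored as (x, y, width, height), as suggested by the full-image box [0, 0, width, height] used in the Examples section, and it assumes that boxes with no overlap with the crop window are dropped. The helper name adjust_bboxes is hypothetical.

```python
def adjust_bboxes(bboxes, crop, out_size):
    """Illustrative only: map (x, y, w, h) boxes through a crop and a resize.

    bboxes:   list of (x, y, w, h) in original image coordinates (assumed format)
    crop:     (crop_x, crop_y, crop_w, crop_h) window in the original image
    out_size: (out_h, out_w) of the resized output
    """
    cx, cy, cw, ch = crop
    out_h, out_w = out_size
    sx, sy = out_w / cw, out_h / ch  # resize factors along x and y
    adjusted = []
    for x, y, w, h in bboxes:
        # Intersect the box with the crop window.
        x0, y0 = max(x, cx), max(y, cy)
        x1, y1 = min(x + w, cx + cw), min(y + h, cy + ch)
        if x1 <= x0 or y1 <= y0:
            continue  # assumption: boxes outside the crop are dropped
        # Shift into crop coordinates, then scale to the output size.
        adjusted.append(((x0 - cx) * sx, (y0 - cy) * sy,
                         (x1 - x0) * sx, (y1 - y0) * sy))
    return adjusted


# A full-image box is clipped to the crop; a box outside the crop disappears.
boxes = adjust_bboxes([(0, 0, 100, 100), (0, 0, 10, 10)],
                      crop=(20, 20, 50, 50), out_size=(50, 50))
print(boxes)
```

With a box that covers the whole image, as in the Examples section, the adjusted box simply covers the whole output image.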

Parameters

size (Union[int, Sequence[int]]) – The size of the output image. The size value(s) must be positive. If size is an integer, a square crop of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).
scale (Union[list, tuple], optional) – Range (min, max) of the size of the cropped region relative to the original image. Both values must be non-negative. Default: (0.08, 1.0).
ratio (Union[list, tuple], optional) – Range (min, max) of the aspect ratio of the cropped region. Both values must be non-negative. Default: (3. / 4., 4. / 3.).
interpolation (Inter, optional) – Image interpolation method defined by Inter. Default: Inter.BILINEAR.
max_attempts (int, optional) – The maximum number of attempts to propose a valid crop area. If this number is exceeded, a center crop is used instead. Default: 10.
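To make the interaction of scale, ratio and max_attempts concrete, here is a rough sketch of how a crop window might be proposed. It follows the common formulation of random resized crop (scale as a fraction of the image area, aspect ratio drawn log-uniformly), which is an assumption about the internals rather than something stated on this page. The function name sample_crop and the size of the fallback center crop are hypothetical as well.

```python
import math
import random


def sample_crop(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                max_attempts=10, rng=random):
    """Illustrative only: propose a crop window (x, y, crop_w, crop_h)."""
    area = height * width
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(max_attempts):
        # Assumption: scale is a fraction of the image area.
        target_area = area * rng.uniform(scale[0], scale[1])
        # Assumption: the aspect ratio (width / height) is log-uniform.
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            # The proposal fits inside the image: pick a random position.
            x = rng.randint(0, width - crop_w)
            y = rng.randint(0, height - crop_h)
            return x, y, crop_w, crop_h
    # No valid proposal within max_attempts: fall back to a center crop.
    # Assumption: a centered square on the shorter side.
    side = min(height, width)
    return (width - side) // 2, (height - side) // 2, side, side


print(sample_crop(100, 100))
# A proposal that can never fit triggers the fallback.
print(sample_crop(10, 1000, scale=(1.0, 1.0), ratio=(1.0, 1.0), max_attempts=3))
```

Whatever window is chosen, the cropped region is then resized to size, so the output shape does not depend on the sampled values.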

Raises

TypeError – If size is not of type int or Sequence[int].
TypeError – If scale is not of type list or tuple.
TypeError – If ratio is not of type list or tuple.
TypeError – If interpolation is not of type Inter.
TypeError – If max_attempts is not of type int.
ValueError – If size is not positive.
ValueError – If scale is negative.
ValueError – If ratio is negative.
ValueError – If max_attempts is not positive.
RuntimeError – If the shape of the input image is not <H, W> or <H, W, C>.

Supported Platforms: CPU

Examples

>>> import numpy as np
>>> import mindspore.dataset as ds
>>> import mindspore.dataset.vision as vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> # Use the transform in dataset pipeline mode
>>> data = np.random.randint(0, 255, size=(100, 100, 3)).astype(np.float32)
>>> numpy_slices_dataset = ds.NumpySlicesDataset(data, ["image"])
>>> func = lambda img: (data, np.array([[0, 0, data.shape[1], data.shape[0]]]).astype(np.float32))
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=[func],
...                                                 input_columns=["image"],
...                                                 output_columns=["image", "bbox"])
>>> bbox_op = vision.RandomResizedCropWithBBox(size=50, interpolation=Inter.NEAREST)
>>> transforms_list = [bbox_op]
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=transforms_list,
...                                                 input_columns=["image", "bbox"])
>>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["image"].shape, item["image"].dtype)
...     print(item["bbox"].shape, item["bbox"].dtype)
...     break
(50, 50, 3) float32
(1, 4) float32
>>>
>>> # Use the transform in eager mode
>>> data = np.random.randint(0, 255, size=(100, 100, 3)).astype(np.float32)
>>> func = lambda img: (data, np.array([[0, 0, data.shape[1], data.shape[0]]]).astype(data.dtype))
>>> func_data, func_bboxes = func(data)
>>> output = vision.RandomResizedCropWithBBox((16, 64), (0.5, 0.5), (0.5, 0.5))(func_data, func_bboxes)
>>> print(output[0].shape, output[0].dtype)
(16, 64, 3) float32
>>> print(output[1].shape, output[1].dtype)
(1, 4) float32

Tutorial Examples:

Illustration of vision transforms