mindspore.dataset.vision.RandomCropDecodeResize

class mindspore.dataset.vision.RandomCropDecodeResize(size, scale=(0.08, 1.0), ratio=(3. / 4., 4. / 3.), interpolation=Inter.BILINEAR, max_attempts=10)

A combination of Crop, Decode and Resize. It achieves better performance on JPEG images. This operation crops the input image at a random location, decodes the cropped region in RGB mode, and resizes the decoded image.

Parameters

size (Union[int, Sequence[int]]) – The output size of the resized image. The size value(s) must be positive. If size is an integer, a square output of size (size, size) is returned. If size is a sequence of length 2, it should be (height, width).
scale (Union[list, tuple], optional) – Range [min, max) of the size of the cropped region relative to the original size. Both values must be non-negative. Default: (0.08, 1.0).
ratio (Union[list, tuple], optional) – Range [min, max) of the aspect ratio of the cropped region. Both values must be non-negative. Default: (3. / 4., 4. / 3.).
interpolation (Inter, optional) – Image interpolation method defined by Inter. Default: Inter.BILINEAR.
max_attempts (int, optional) – The maximum number of attempts to propose a valid crop area. The value must be positive. If the limit is exceeded, the operation falls back to a center crop. Default: 10.
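
How scale, ratio and max_attempts interact can be illustrated with a small pure-Python sketch. It follows the widely used random-resized-crop sampling procedure and is not taken from the MindSpore source: the function name sample_crop_box, the rng argument, the reading of ratio as width / height, the treatment of scale as a fraction of the image area, and the centered-square fallback are all assumptions made for illustration.

```python
import math
import random


def sample_crop_box(height, width, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                    max_attempts=10, rng=None):
    """Hypothetical helper: return (top, left, crop_h, crop_w) for an image."""
    rng = rng or random.Random()
    area = height * width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(max_attempts):
        # Assumption: scale is a fraction of the original image area.
        target_area = area * rng.uniform(scale[0], scale[1])
        # Assumption: ratio is width / height, sampled uniformly in log space.
        aspect = math.exp(rng.uniform(log_ratio[0], log_ratio[1]))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        # A proposal is valid only if the box fits inside the image.
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    # Assumed fallback after max_attempts failures: a centered square crop.
    crop_h = crop_w = min(height, width)
    return (height - crop_h) // 2, (width - crop_w) // 2, crop_h, crop_w
```

In this sketch, a narrow scale or ratio range makes valid proposals less likely on elongated images, so a small max_attempts value leads to the center-crop fallback more often.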

Raises

TypeError – If size is not of type int or Sequence[int].
TypeError – If scale is not of type list or tuple.
TypeError – If ratio is not of type list or tuple.
TypeError – If interpolation is not of type Inter.
TypeError – If max_attempts is not of type int.
ValueError – If size is not positive.
ValueError – If scale is negative.
ValueError – If ratio is negative.
ValueError – If max_attempts is not positive.
RuntimeError – If given tensor is not a 1D sequence.

Supported Platforms: CPU

Examples

>>> import os
>>> import numpy as np
>>> from PIL import Image, ImageDraw
>>> import mindspore.dataset as ds
>>> import mindspore.dataset.vision as vision
>>> from mindspore.dataset.vision import Inter
>>>
>>> # Use the transform in dataset pipeline mode
>>> class MyDataset:
...     def __init__(self):
...         self.data = []
...         img = Image.new("RGB", (300, 300), (255, 255, 255))
...         draw = ImageDraw.Draw(img)
...         draw.ellipse(((0, 0), (100, 100)), fill=(255, 0, 0), outline=(255, 0, 0), width=5)
...         img.save("./1.jpg")
...         data = np.fromfile("./1.jpg", np.uint8)
...         self.data.append(data)
...
...     def __getitem__(self, index):
...         return self.data[0]
...
...     def __len__(self):
...         return 5
>>>
>>> my_dataset = MyDataset()
>>> generator_dataset = ds.GeneratorDataset(my_dataset, column_names="image")
>>> resize_crop_decode_op = vision.RandomCropDecodeResize(size=(50, 75),
...                                                       scale=(0.25, 0.5),
...                                                       interpolation=Inter.NEAREST,
...                                                       max_attempts=5)
>>> transforms_list = [resize_crop_decode_op]
>>> generator_dataset = generator_dataset.map(operations=transforms_list, input_columns=["image"])
>>> for item in generator_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["image"].shape, item["image"].dtype)
...     break
(50, 75, 3) uint8
>>> os.remove("./1.jpg")
>>>
>>> # Use the transform in eager mode
>>> img = Image.new("RGB", (300, 300), (255, 255, 255))
>>> draw = ImageDraw.Draw(img)
>>> draw.polygon([(50, 50), (150, 50), (100, 150)], fill=(0, 255, 0), outline=(0, 255, 0))
>>> img.save("./2.jpg")
>>> data = np.fromfile("./2.jpg", np.uint8)
>>> output = vision.RandomCropDecodeResize(size=(50, 75), scale=(0, 10.0), ratio=(0.5, 0.5),
...                                        interpolation=Inter.BILINEAR, max_attempts=1)(data)
>>> print(np.array(output).shape, np.array(output).dtype)
(50, 75, 3) uint8
>>> os.remove("./2.jpg")

Tutorial Examples:

Illustration of vision transforms