mindspore.dataset.vision
This module provides vision (image) augmentation operations. Some augmentations are implemented in C++ with OpenCV for high performance. Others are implemented in Python with PIL.
The API examples commonly import the following modules:
import mindspore.dataset as ds
import mindspore.dataset.vision as vision
import mindspore.dataset.vision.utils as utils
Note: Legacy c_transforms and py_transforms are deprecated but can still be imported as follows:
import mindspore.dataset.vision.c_transforms as c_vision
import mindspore.dataset.vision.py_transforms as py_vision
See the Image Data Processing and Augmentation tutorial for more details.
Descriptions of common data processing terms are as follows:
TensorOperation, the base class of all data processing operations implemented in C++.
ImageTensorOperation, the base class of all image processing operations. It is a derived class of TensorOperation.
PyTensorOperation, the base class of all data processing operations implemented in Python.
Data transform operations can be executed either in a data processing pipeline or in eager mode:
Pipeline mode is generally used to process datasets. For examples, refer to the introduction to the data processing pipeline.
Eager mode is generally used for scattered samples. Examples of image preprocessing are as follows:
import mindspore.dataset.vision as vision
from PIL import Image, ImageDraw

# Draw a red circle on a white canvas and save it as a JPEG file.
img = Image.new("RGB", (300, 300), (255, 255, 255))
draw = ImageDraw.Draw(img)
draw.ellipse(((0, 0), (100, 100)), fill=(255, 0, 0), outline=(255, 0, 0), width=5)
img.save("./1.jpg")

# Read the encoded image back as raw bytes.
with open("./1.jpg", "rb") as f:
    data = f.read()

# Apply each transform eagerly by calling the operation on the data.
data_decoded = vision.Decode()(data)
data_cropped = vision.RandomCrop(size=(250, 250))(data_decoded)
data_resized = vision.Resize(size=(224, 224))(data_cropped)
data_normalized = vision.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
                                   std=[0.229 * 255, 0.224 * 255, 0.225 * 255])(data_resized)
data_hwc2chw = vision.HWC2CHW()(data_normalized)
print("data: {}, shape: {}".format(data_hwc2chw, data_hwc2chw.shape), flush=True)
Transforms
Apply gamma correction on input image.
Apply AutoAugment data augmentation method based on AutoAugment: Learning Augmentation Strategies from Data.
Apply automatic contrast on input image.
Apply a given image processing operation on a random selection of bounding box regions of a given image.
Crop the input image at the center to the given size.
Change the color space of the image.
Crop the input image at a specific location.
Apply CutMix transformation on input batch of images and labels.
Randomly cut (mask) out a given number of square patches from the input image array.
Decode the input image in RGB mode.
Apply histogram equalization on input image.
Crop the given image into one central crop and four corners.
Blur input image with the specified Gaussian kernel.
Convert the input PIL Image to grayscale.
Flip the input image horizontally.
Convert the input numpy.ndarray images from HSV to RGB.
Transpose the input image from shape (H, W, C) to (C, H, W).
Apply invert on input image in RGB mode.
Linearly transform the input numpy.ndarray image with a square transformation matrix and a mean vector.
Randomly mix up a batch of numpy.ndarray images together with their labels.
Apply MixUp transformation on input batch of images and labels.
Normalize the input image with respect to mean and standard deviation.
Normalize the input image with respect to mean and standard deviation, then pad an extra channel with value zero.
Pad the image according to padding parameters.
Pad the image to a fixed size.
Randomly adjust the sharpness of the input image with a given probability.
Apply a random affine transformation to the input image.
Automatically adjust the contrast of the image with a given probability.
Adjust the color of the input image by a fixed or random degree.
Randomly adjust the brightness, contrast, saturation, and hue of the input image.
Crop the input image at a random location.
A combination of Crop, Decode and Resize.
Crop the input image at a random location and adjust bounding boxes accordingly.
Apply histogram equalization on the input image with a given probability.
Randomly erase pixels within a randomly selected rectangular area on the input numpy.ndarray image.
Randomly convert the input PIL Image to grayscale.
Randomly flip the input image horizontally with a given probability.
Randomly flip the input image horizontally with a given probability and adjust bounding boxes accordingly.
Randomly invert the colors of image with a given probability.
Add AlexNet-style PCA-based noise to an image.
Randomly apply perspective transformation to the input PIL Image with a given probability.
Randomly reduce the number of bits for each color channel to posterize the input image with a given probability.
Crop the input image randomly and resize the cropped image using a selected interpolation mode.
Crop the input image to a random size and aspect ratio and adjust bounding boxes accordingly.
Resize the input image using
Tensor operation to resize the input image using a randomly selected interpolation mode
Rotate the input image randomly within a specified range of degrees.
Choose a random sub-policy from a policy list to be applied on the input image.
Adjust the sharpness of the input image by a fixed or random degree.
Randomly select a subrange within the specified threshold range and set the pixel value within the subrange to (255 - pixel).
Randomly flip the input image vertically with a given probability.
Randomly flip the input image vertically with a given probability and adjust bounding boxes accordingly.
Rescale the input image with the given rescale and shift.
Resize the input image to the given size with a given interpolation mode.
Resize the input image to the given size and adjust bounding boxes accordingly.
Convert the input numpy.ndarray images from RGB to HSV.
Rotate the input image by specified degrees.
Slice Tensor to multiple patches in horizontal and vertical directions.
Crop the given image into one central crop and four corners with the flipped version of these.
Convert the input PIL Image to a numpy.ndarray image.
Convert the input decoded numpy.ndarray image to PIL Image.
Convert the input PIL Image or numpy.ndarray to numpy.ndarray of the desired dtype, rescale the pixel value range from [0, 255] to [0.0, 1.0] and change the shape from (H, W, C) to (C, H, W).
Cast the input to a given MindSpore data type or NumPy data type.
Uniformly select a number of transformations from a sequence and apply them sequentially and randomly, which means that there is a chance that a chosen transformation will not be applied.
Flip the input image vertically.
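To make a few of the descriptions above concrete, the following NumPy-only sketch mimics what the rescale, normalize, and (H, W, C) to (C, H, W) transpose operations are described as computing. It is not the library implementation: the function names are hypothetical, and the formulas (image * rescale + shift, and (image - mean) / std per channel) are the conventional definitions assumed here.

```python
import numpy as np

def rescale_sketch(image, rescale, shift):
    """Hypothetical helper: output = image * rescale + shift."""
    return image.astype(np.float32) * np.float32(rescale) + np.float32(shift)

def normalize_sketch(image, mean, std):
    """Hypothetical helper: per-channel (image - mean) / std on an HWC image."""
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    # mean and std broadcast over the last (channel) axis.
    return (image.astype(np.float32) - mean) / std

def hwc2chw_sketch(image):
    """Hypothetical helper: move the channel axis to the front."""
    return np.transpose(image, (2, 0, 1))

# A 4x5 RGB image whose pixels are all 255 (illustrative data only).
image = np.full((4, 5, 3), 255, dtype=np.uint8)

scaled = rescale_sketch(image, 1.0 / 255.0, 0.0)
normalized = normalize_sketch(scaled, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
chw = hwc2chw_sketch(normalized)
print(chw.shape, chw.dtype)
```

Chaining the three steps in this order mirrors a common preprocessing sequence: bring pixel values into a small floating-point range, standardize each channel, then switch to the channel-first layout.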
Utilities
AutoAugment policy for different datasets.
Padding Mode, Border Type.
The color conversion mode.
Data Format of images after batch operation.
Interpolation Modes.
Mode to Slice Tensor into multiple parts.
Get the number of input image channels.
Get the size of input image as [height, width].
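The last two entries describe simple shape queries. As a rough illustration of what they return for a NumPy image, here is a sketch that assumes an (H, W) or (H, W, C) array layout. The helper names are hypothetical and are not the module's own function names.

```python
import numpy as np

def image_size_sketch(image):
    """Hypothetical helper: return [height, width] of an (H, W) or (H, W, C) array."""
    if image.ndim not in (2, 3):
        raise ValueError("expected an (H, W) or (H, W, C) array")
    return [int(image.shape[0]), int(image.shape[1])]

def num_channels_sketch(image):
    """Hypothetical helper: return the channel count, treating (H, W) as one channel."""
    if image.ndim == 2:
        return 1
    if image.ndim == 3:
        return int(image.shape[2])
    raise ValueError("expected an (H, W) or (H, W, C) array")

# Illustrative inputs: one RGB image and one grayscale image.
rgb = np.zeros((48, 64, 3), dtype=np.uint8)
gray = np.zeros((48, 64), dtype=np.uint8)

rgb_size = image_size_sketch(rgb)
rgb_channels = num_channels_sketch(rgb)
gray_channels = num_channels_sketch(gray)
print(rgb_size, rgb_channels, gray_channels)
```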