Vision Transform Gallery
This guide demonstrates how to use the transforms in the mindspore.dataset.vision module.
Environment Setup
[1]:
from download import download
import matplotlib.pyplot as plt
from PIL import Image
import mindspore.dataset as ds
import mindspore.dataset.vision as vision
# Download opensource datasets
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/flamingos.jpg"
download(url, './flamingos.jpg', replace=True)
orig_img = Image.open('flamingos.jpg')
# Env set for randomness and prepare plot function
ds.config.set_seed(66)
def plot(imgs, first_origin=True, **kwargs):
    """Show the images in one row, optionally preceded by the original image."""
    num_rows = 1
    num_cols = len(imgs) + first_origin
    _, axs = plt.subplots(nrows=num_rows, ncols=num_cols, squeeze=False)
    if first_origin:
        imgs = [orig_img] + imgs
    for idx, img in enumerate(imgs):
        ax = axs[0, idx]
        ax.imshow(img, **kwargs)
        # Hide axis ticks and labels so that only the image is shown
        ax.set(xticklabels=[], yticklabels=[], xticks=[], yticks=[])
    if first_origin:
        axs[0, 0].set(title='Original image')
        axs[0, 0].title.set_size(8)
    plt.tight_layout()
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/flamingos.jpg (45 kB)
file_sizes: 100%|███████████████████████████| 45.8k/45.8k [00:00<00:00, 640kB/s]
Successfully downloaded file to ./flamingos.jpg
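Geometric Transforms
Geometric transforms change the geometric properties of an image, such as its shape, size, orientation, or position, by applying mathematical operations to its pixels or coordinates.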
Pad
mindspore.dataset.vision.Pad pads the borders of the image with pixels.
[2]:
padded_imgs = [vision.Pad(padding=padding)(orig_img) for padding in (3, 10, 30, 50)]
plot(padded_imgs)
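The effect of padding on the image dimensions is plain arithmetic. The helper below is a minimal sketch and not MindSpore code: `padded_size` is a hypothetical name, the 500 x 300 input size is made up for illustration, and the handling of 2- and 4-element sequences (left/right then top/bottom; left, top, right, bottom) is an assumption that should be checked against the `Pad` API reference.

```python
def padded_size(width, height, padding):
    """Return (width, height) after padding. Hypothetical helper, not a MindSpore API."""
    if isinstance(padding, int):
        # A single number is applied to all four borders.
        left = top = right = bottom = padding
    elif len(padding) == 2:
        # Assumed convention: (left and right, top and bottom).
        left = right = padding[0]
        top = bottom = padding[1]
    elif len(padding) == 4:
        # Assumed convention: (left, top, right, bottom).
        left, top, right, bottom = padding
    else:
        raise ValueError("padding must be an int or a sequence of 2 or 4 ints")
    return width + left + right, height + top + bottom


# Made-up 500 x 300 image, padded with the same values used in the cell above.
for p in (3, 10, 30, 50):
    print(p, padded_size(500, 300, p))
```

Each border grows by the padding value, so a uniform padding of `p` enlarges both dimensions by `2 * p`.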
Resize
mindspore.dataset.vision.Resize resizes the image to the given size.
[3]:
resized_imgs = [vision.Resize(size=size)(orig_img) for size in (30, 50, 100)]
plot(resized_imgs)
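When `size` is a single integer, as in the cell above, the output shape depends on the input aspect ratio. The sketch below assumes the shorter edge is scaled to `size` with the aspect ratio preserved, and that a two-element `size` means (height, width). `resized_size` is a hypothetical helper, and the rounding rule is an assumption that may differ from what the library does.

```python
def resized_size(width, height, size):
    """Return (width, height) after resizing. Hypothetical helper, not a MindSpore API."""
    if isinstance(size, int):
        # Assumed behaviour: the shorter edge becomes `size`, the aspect ratio is kept.
        if width <= height:
            return size, round(height * size / width)
        return round(width * size / height), size
    # Assumed convention for a pair: (height, width).
    out_height, out_width = size
    return out_width, out_height


# Made-up 400 x 200 landscape image.
for s in (30, 50, 100):
    print(s, resized_size(400, 200, s))
```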
CenterCrop
mindspore.dataset.vision.CenterCrop crops the central region of the image.
[4]:
center_crops = [vision.CenterCrop(size=size)(orig_img) for size in (30, 50, 100)]
plot(center_crops)
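A center crop amounts to array slicing around the image midpoint. The following NumPy sketch illustrates the idea only: `center_crop` is a hypothetical function, it assumes the crop fits inside the image, and the exact offset rounding used by the library may differ.

```python
import numpy as np


def center_crop(img, size):
    """Crop the central region of an (H, W[, C]) array. Sketch, not MindSpore code."""
    crop_h, crop_w = (size, size) if isinstance(size, int) else size
    h, w = img.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("this sketch assumes the crop fits inside the image")
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return img[top:top + crop_h, left:left + crop_w]


# Toy 6 x 8 single-channel "image" whose values encode their position.
img = np.arange(6 * 8).reshape(6, 8)
patch = center_crop(img, (2, 4))
print(patch)
```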
FiveCrop
mindspore.dataset.vision.FiveCrop crops five sub-images of the given size: one at each of the four corners and one at the center of the image.
[5]:
(top_left, top_right, bottom_left, bottom_right, center) = vision.FiveCrop(size=(100, 100))(orig_img)
plot([top_left, top_right, bottom_left, bottom_right, center], True)
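The five crop regions can be written down explicitly. The sketch below computes them as (left, upper, right, lower) boxes for a made-up 500 x 300 image. `five_crop_boxes` is a hypothetical helper, and the integer rounding of the center box is an assumption.

```python
def five_crop_boxes(width, height, crop_w, crop_h):
    """Return the five crop boxes as (left, upper, right, lower). Illustrative only."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop size must not exceed the image size")
    cx = (width - crop_w) // 2
    cy = (height - crop_h) // 2
    return {
        "top_left": (0, 0, crop_w, crop_h),
        "top_right": (width - crop_w, 0, width, crop_h),
        "bottom_left": (0, height - crop_h, crop_w, height),
        "bottom_right": (width - crop_w, height - crop_h, width, height),
        "center": (cx, cy, cx + crop_w, cy + crop_h),
    }


boxes = five_crop_boxes(500, 300, 100, 100)
for name, box in boxes.items():
    print(name, box)
```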
RandomPerspective
mindspore.dataset.vision.RandomPerspective applies a perspective transform to the input image with the given probability.
[6]:
perspective_transformer = vision.RandomPerspective(distortion_scale=0.6, prob=1.0)
perspective_imgs = [perspective_transformer(orig_img) for _ in range(4)]
plot(perspective_imgs)
RandomRotation
mindspore.dataset.vision.RandomRotation rotates the input image by a random angle.
[7]:
rotater = vision.RandomRotation(degrees=(0, 180))
rotated_imgs = [rotater(orig_img) for _ in range(4)]
plot(rotated_imgs)
RandomAffine
mindspore.dataset.vision.RandomAffine applies a random affine transform to the input image.
[8]:
affine_transformer = vision.RandomAffine(degrees=(30, 70), translate=(0.1, 0.3), scale=(0.5, 0.75))
affine_imgs = [affine_transformer(orig_img) for _ in range(4)]
plot(affine_imgs)
RandomCrop
mindspore.dataset.vision.RandomCrop crops a randomly located region from the input image.
[9]:
cropper = vision.RandomCrop(size=(128, 128))
crops = [cropper(orig_img) for _ in range(4)]
plot(crops)
RandomResizedCrop
mindspore.dataset.vision.RandomResizedCrop crops a random region of the input image and resizes the crop to the given size.
[10]:
resize_cropper = vision.RandomResizedCrop(size=(32, 32))
resized_crops = [resize_cropper(orig_img) for _ in range(4)]
plot(resized_crops)
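To make "random crop, then resize" concrete, the sketch below samples a crop box from an area fraction and an aspect ratio, which is a common formulation of this augmentation. Whether MindSpore samples in exactly this way is an assumption. `sample_crop_box`, the `scale` and `ratio` ranges, the retry count, and the whole-image fallback are all illustrative choices.

```python
import math
import random


def sample_crop_box(width, height, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3),
                    rng=None, max_attempts=10):
    """Sample (left, top, crop_w, crop_h). Illustrative sketch, not MindSpore code."""
    rng = rng or random.Random()
    area = width * height
    for _ in range(max_attempts):
        # Pick the crop area as a fraction of the image area.
        target_area = area * rng.uniform(*scale)
        # Pick the aspect ratio uniformly in log space so that r and 1/r are equally likely.
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        crop_w = round(math.sqrt(target_area * aspect))
        crop_h = round(math.sqrt(target_area / aspect))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            left = rng.randint(0, width - crop_w)
            top = rng.randint(0, height - crop_h)
            return left, top, crop_w, crop_h
    # Simplified fallback for this sketch: use the whole image.
    return 0, 0, width, height


print(sample_crop_box(500, 300, rng=random.Random(0)))
```

The sampled region would then be resized to the requested output size, for example 32 x 32 as in the cell above.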
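Photometric Transforms
Photometric transforms modify the photometric properties of an image, such as its brightness, contrast, color, or hue. They change the visual appearance of the image while preserving its geometric structure.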
Grayscale
mindspore.dataset.vision.Grayscale converts the image to grayscale.
[11]:
gray_img = vision.Grayscale()(orig_img)
plot([gray_img], cmap='gray')
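A grayscale conversion is typically a weighted sum of the color channels. The NumPy sketch below uses the ITU-R BT.601 luma weights (0.299, 0.587, 0.114). It is an assumption that the library uses the same weights and rounding, and `to_grayscale` is a hypothetical name.

```python
import numpy as np


def to_grayscale(rgb):
    """Convert an (H, W, 3) uint8 RGB array to (H, W) uint8. Sketch only."""
    weights = np.array([0.299, 0.587, 0.114])  # BT.601 luma weights (assumed)
    gray = rgb[..., :3].astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


# 2 x 2 test image: white, red, green, blue.
pixels = np.array([[[255, 255, 255], [255, 0, 0]],
                   [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
gray = to_grayscale(pixels)
print(gray)
```

Green contributes most to the perceived brightness, so the pure green pixel maps to a lighter gray than pure red or pure blue.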
RandomColorAdjust
mindspore.dataset.vision.RandomColorAdjust randomly adjusts the brightness, contrast, saturation, and hue of the input image.
[12]:
jitter = vision.RandomColorAdjust(brightness=.5, hue=.3)
jittered_imgs = [jitter(orig_img) for _ in range(4)]
plot(jittered_imgs)
GaussianBlur
mindspore.dataset.vision.GaussianBlur blurs the input image with the specified Gaussian kernel.
[13]:
blurrer = vision.GaussianBlur(kernel_size=(5, 9), sigma=(0.1, 5))
blurred_imgs = [blurrer(orig_img) for _ in range(4)]
plot(blurred_imgs)
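The kernel behind a Gaussian blur can be built directly from its definition. The sketch below constructs a normalized 2-D kernel as the outer product of two 1-D Gaussians. It assumes that `kernel_size` and `sigma` pairs are given as (width, height), which should be verified against the API reference, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel_1d(size, sigma):
    """Normalized 1-D Gaussian kernel with an odd, positive size."""
    if size <= 0 or size % 2 == 0:
        raise ValueError("kernel size must be positive and odd")
    x = np.arange(size) - size // 2
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_kernel_2d(size_x, size_y, sigma_x, sigma_y):
    """Separable 2-D kernel of shape (size_y, size_x). Sketch only."""
    return np.outer(gaussian_kernel_1d(size_y, sigma_y),
                    gaussian_kernel_1d(size_x, sigma_x))


# Same numbers as the cell above, read as (width, height) = (5, 9) and sigma = (0.1, 5).
kernel = gaussian_kernel_2d(5, 9, 0.1, 5.0)
print(kernel.shape, kernel.sum())
```

Under that reading, a horizontal sigma of 0.1 leaves almost all of the weight in the center column, so the blur acts mostly in the vertical direction.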
RandomInvert
mindspore.dataset.vision.RandomInvert inverts the colors of the image with the given probability.
[14]:
inverter = vision.RandomInvert()
inverted_imgs = [inverter(orig_img) for _ in range(4)]
plot(inverted_imgs)
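For 8-bit images, inverting colors means replacing each value `v` with `255 - v`. The sketch below applies that with a given probability. `random_invert` is a hypothetical function written for illustration, and the default probability of 0.5 is an assumption.

```python
import random

import numpy as np


def random_invert(img, prob=0.5, rng=random):
    """Invert a uint8 image with probability `prob`. Sketch, not MindSpore code."""
    if rng.random() < prob:
        return 255 - img
    return img


img = np.array([[0, 100], [200, 255]], dtype=np.uint8)
print(random_invert(img, prob=1.0))  # always inverted
print(random_invert(img, prob=0.0))  # never inverted
```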
RandomPosterize
mindspore.dataset.vision.RandomPosterize randomly reduces the number of bits used for each color channel, which gives the image a high-contrast, vividly colored look.
[15]:
posterizer = vision.RandomPosterize(bits=2)
posterized_imgs = [posterizer(orig_img) for _ in range(4)]
plot(posterized_imgs)
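Reducing the bit depth can be expressed as a bit mask that keeps only the most significant bits of each value. The sketch below shows the idea for a fixed `bits` value. `posterize` is a hypothetical helper, and how the library randomizes the number of bits is not modeled.

```python
import numpy as np


def posterize(img, bits):
    """Keep only the `bits` most significant bits of each uint8 value. Sketch only."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be in [1, 8]")
    mask = np.uint8((0xFF << (8 - bits)) & 0xFF)  # bits=2 gives 0b11000000
    return img & mask


values = np.array([0, 63, 64, 200, 255], dtype=np.uint8)
print(posterize(values, 2))
```

With 2 bits, every channel can take only four distinct levels (0, 64, 128, 192), which explains the flat, poster-like color regions.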
RandomSolarize
mindspore.dataset.vision.RandomSolarize randomly inverts the pixels whose values lie within the given threshold range.
[16]:
solarizer = vision.RandomSolarize(threshold=(0, 192))
solarized_imgs = [solarizer(orig_img) for _ in range(4)]
plot(solarized_imgs)
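Solarization inverts only those pixels whose values fall inside a range and leaves the rest untouched. The sketch below applies this to a fixed range. `solarize` is a hypothetical function, and how the library randomizes the range within `threshold` is left out.

```python
import numpy as np


def solarize(img, low, high):
    """Invert uint8 values v with low <= v <= high. Sketch, not MindSpore code."""
    in_range = (img >= low) & (img <= high)
    return np.where(in_range, 255 - img, img)


values = np.array([0, 100, 192, 193, 255], dtype=np.uint8)
print(solarize(values, 0, 192))
```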
RandomAdjustSharpness
mindspore.dataset.vision.RandomAdjustSharpness adjusts the sharpness of the input image with the given probability.
[17]:
sharpness_adjuster = vision.RandomAdjustSharpness(degree=2)
sharpened_imgs = [sharpness_adjuster(orig_img) for _ in range(4)]
plot(sharpened_imgs)
RandomAutoContrast
mindspore.dataset.vision.RandomAutoContrast automatically adjusts the contrast of the image with the given probability.
[18]:
autocontraster = vision.RandomAutoContrast()
autocontrasted_imgs = [autocontraster(orig_img) for _ in range(4)]
plot(autocontrasted_imgs)
RandomEqualize
mindspore.dataset.vision.RandomEqualize applies histogram equalization to the input image with the given probability.
[19]:
equalizer = vision.RandomEqualize()
equalized_imgs = [equalizer(orig_img) for _ in range(4)]
plot(equalized_imgs)
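Histogram equalization remaps intensities through the cumulative histogram so that the output levels are spread over the full range. The sketch below implements the textbook formulation for one 8-bit channel. Library implementations differ in details such as rounding, so this is an illustration rather than a reproduction of MindSpore's result, and `equalize_channel` is a hypothetical name.

```python
import numpy as np


def equalize_channel(channel):
    """Textbook histogram equalization for a non-empty uint8 channel. Sketch only."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # cumulative count at the darkest level present
    total = cdf[-1]
    if total == cdf_min:
        # Only one intensity level is present, so there is nothing to spread out.
        return channel.copy()
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = lut.clip(0, 255).astype(np.uint8)
    return lut[channel]


# Low-contrast toy channel: the values occupy only the range 50..150.
channel = np.array([[50, 50], [100, 150]], dtype=np.uint8)
equalized = equalize_channel(channel)
print(equalized)
```

After equalization the darkest level present maps to 0 and the brightest to 255, while the ordering of the intensities is preserved.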
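Augmentation Transforms
The following transforms combine several transforms and typically implement effective data augmentation methods proposed in research papers.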
AutoAugment
mindspore.dataset.vision.AutoAugment applies the AutoAugment data augmentation method, implemented according to the paper AutoAugment: Learning Augmentation Strategies from Data.
[20]:
augmenter = vision.AutoAugment(policy=vision.AutoAugmentPolicy.IMAGENET)
imgs = [augmenter(orig_img) for _ in range(4)]
plot(imgs)
RandAugment
mindspore.dataset.vision.RandAugment applies the RandAugment data augmentation method to the input image, implemented according to the paper RandAugment: Practical automated data augmentation with a reduced search space.
[21]:
augmenter = vision.RandAugment()
imgs = [augmenter(orig_img) for _ in range(4)]
plot(imgs)
TrivialAugmentWide
mindspore.dataset.vision.TrivialAugmentWide applies the TrivialAugmentWide data augmentation method to the input image, implemented according to the paper TrivialAugment: Tuning-free Yet State-of-the-Art Data Augmentation.
[22]:
augmenter = vision.TrivialAugmentWide()
imgs = [augmenter(orig_img) for _ in range(4)]
plot(imgs)
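The selection logic of these methods can be contrasted in a few lines. As described in the respective papers, RandAugment applies a fixed number of randomly chosen operations at one shared magnitude, whereas TrivialAugment applies a single randomly chosen operation at a randomly sampled strength. The sketch below works on a flat list of pixel values with three toy operations. The operation set, the function names, the mapping from magnitude to strength, and the parameter values are illustrative assumptions and do not reproduce the library's behavior.

```python
import random


def brighten(pixels, strength):
    return [min(255, round(p * (1.0 + strength))) for p in pixels]


def invert(pixels, strength):
    return [255 - p for p in pixels]  # strength is ignored by this toy op


def posterize(pixels, strength):
    bits = 8 - round(4 * strength)  # stronger means fewer bits kept
    mask = (0xFF << (8 - bits)) & 0xFF
    return [p & mask for p in pixels]


TOY_OPS = [brighten, invert, posterize]  # stand-ins for real image operations


def rand_augment(pixels, num_ops=2, magnitude=9, num_bins=31, rng=None):
    """Apply `num_ops` random ops, all at the same fixed magnitude."""
    rng = rng or random.Random()
    strength = magnitude / (num_bins - 1)
    for _ in range(num_ops):
        pixels = rng.choice(TOY_OPS)(pixels, strength)
    return pixels


def trivial_augment(pixels, num_bins=31, rng=None):
    """Apply one random op at a uniformly sampled strength."""
    rng = rng or random.Random()
    strength = rng.randrange(num_bins) / (num_bins - 1)
    return rng.choice(TOY_OPS)(pixels, strength)


sample = [0, 64, 128, 255]
print(rand_augment(sample, rng=random.Random(1)))
print(trivial_augment(sample, rng=random.Random(1)))
```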
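Randomly Applied Transforms
Some transforms are applied randomly with a given probability, so the transformed image may be identical to the original image.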
RandomHorizontalFlip
mindspore.dataset.vision.RandomHorizontalFlip randomly flips the input image horizontally.
[23]:
hflipper = vision.RandomHorizontalFlip(0.5)
transformed_imgs = [hflipper(orig_img) for _ in range(4)]
plot(transformed_imgs)
RandomVerticalFlip
mindspore.dataset.vision.RandomVerticalFlip randomly flips the input image vertically.
[24]:
vflipper = vision.RandomVerticalFlip(0.5)
transformed_imgs = [vflipper(orig_img) for _ in range(4)]
plot(transformed_imgs)
RandomApply
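mindspore.dataset.transforms.RandomApply takes a list of augmentation operations and a probability, and at run time applies the operations in the list randomly according to that probability.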
[25]:
import mindspore.dataset.transforms as T
applier = T.RandomApply(transforms=[vision.RandomCrop(size=(64, 64))], prob=0.5)
transformed_imgs = [applier(orig_img) for _ in range(4)]
plot(transformed_imgs)
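The control flow of such a wrapper is small enough to sketch in plain Python. The version below assumes that the whole list is either applied in order or skipped entirely on each call. `random_apply` is a hypothetical function, and simple arithmetic lambdas stand in for image transforms.

```python
import random


def random_apply(transforms, prob=0.5, rng=None):
    """Return a callable that runs all `transforms` in order with probability `prob`."""
    rng = rng or random.Random()

    def apply(x):
        if rng.random() < prob:
            for transform in transforms:
                x = transform(x)
        return x

    return apply


steps = [lambda x: x + 1, lambda x: x * 2]
always = random_apply(steps, prob=1.0)
never = random_apply(steps, prob=0.0)
sometimes = random_apply(steps, prob=0.5, rng=random.Random(0))

print(always(3), never(3))
print([sometimes(3) for _ in range(8)])
```

With `prob=0.5`, roughly half of the calls return the input unchanged, which matches the mix of cropped and uncropped images produced by the cell above.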
Loading and Processing Image Files in a Data Pipeline
Use mindspore.dataset.ImageFolderDataset to load image files from disk into a data pipeline, then apply further augmentation operations.
[26]:
from download import download
import os
import mindspore.dataset as ds
# Download a small ImageNet subset as an example
url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/imageset.zip"
download(url, "./", kind="zip", replace=True)
# There are 5 classes in the image folder.
os.listdir("./imageset")
# Load these 5 classes into dataset pipeline
dataset = ds.ImageFolderDataset("./imageset", shuffle=False)
# check the column names inside the dataset. "image" column represents the image content and "label" column represents the corresponding label of image.
print("column names:", dataset.get_col_names())
# since the original image is not decoded, apply decode first on "image" column
dataset = dataset.map(vision.Decode(), input_columns=["image"])
# check results
print(">>>>> after decode")
for data, label in dataset:
    print(data.shape, label)
# let's do some transforms on dataset
# apply resize on images
dataset = dataset.map(vision.Resize(size=(48, 48)), input_columns=["image"])
# check results
print(">>>>> after resize")
images = []
for image, label in dataset:
    images.append(image.asnumpy())
    print(image.shape, label)
plot(images[:5], first_origin=False)
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/imageset.zip (45 kB)
file_sizes: 100%|██████████████████████████| 45.7k/45.7k [00:00<00:00, 1.01MB/s]
Extracting zip file...
Successfully downloaded / unzipped to ./
column names: ['image', 'label']
>>>>> after decode
(64, 64, 3) 0
(64, 64, 3) 0
(64, 64, 3) 0
(64, 64, 3) 1
(64, 64, 3) 1
(64, 64, 3) 1
(64, 64, 3) 1
(64, 64, 3) 2
(64, 64, 3) 2
(64, 64, 3) 2
(64, 64, 3) 3
(64, 64, 3) 3
(64, 64, 3) 3
(64, 64, 3) 4
(64, 64, 3) 4
(64, 64, 3) 4
>>>>> after resize
(48, 48, 3) 0
(48, 48, 3) 0
(48, 48, 3) 0
(48, 48, 3) 1
(48, 48, 3) 1
(48, 48, 3) 1
(48, 48, 3) 1
(48, 48, 3) 2
(48, 48, 3) 2
(48, 48, 3) 2
(48, 48, 3) 3
(48, 48, 3) 3
(48, 48, 3) 3
(48, 48, 3) 4
(48, 48, 3) 4
(48, 48, 3) 4
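In the cell above, each `map` call only registers an operation on a column, and the operations run when the dataset is iterated. The plain-Python sketch below mimics that behavior to make the evaluation order visible. `TinyPipeline` is a hypothetical stand-in written for this illustration and not a MindSpore class, and string operations replace decoding and resizing.

```python
class TinyPipeline:
    """Hypothetical stand-in for a dataset pipeline; not a MindSpore class."""

    def __init__(self, rows, ops=()):
        self._rows = rows
        self._ops = list(ops)

    def map(self, operation, input_column):
        # Register the operation and return a new pipeline; nothing runs yet.
        return TinyPipeline(self._rows, self._ops + [(input_column, operation)])

    def __iter__(self):
        # Operations are applied row by row, in registration order, during iteration.
        for row in self._rows:
            row = dict(row)
            for column, operation in self._ops:
                row[column] = operation(row[column])
            yield row


calls = []


def fake_decode(value):
    calls.append("decode")
    return value.upper()


rows = [{"image": "a", "label": 0}, {"image": "b", "label": 1}]
pipeline = TinyPipeline(rows).map(fake_decode, "image").map(lambda s: s * 2, "image")

calls_before_iteration = len(calls)  # still 0: map() did not execute anything
result = list(pipeline)
print(calls_before_iteration, result)
```

Iterating the pipeline a second time would run the registered operations again from the source rows, which is why the decode and resize steps above are applied each time the dataset is looped over.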