Function Differences with tf.data.Dataset.shuffle
tf.data.Dataset.shuffle
tf.data.Dataset.shuffle(
    buffer_size,
    seed=None,
    reshuffle_each_iteration=None
)
For more information, see tf.data.Dataset.shuffle.
mindspore.dataset.GeneratorDataset.shuffle
mindspore.dataset.GeneratorDataset.shuffle(
    buffer_size
)
For more information, see mindspore.dataset.GeneratorDataset.shuffle.
Differences
TensorFlow: Randomly shuffles the data in the pipeline. The random seed is set through the seed parameter, and reshuffle_each_iteration controls whether the data is reshuffled at each iteration.
MindSpore: Randomly shuffles the data in the pipeline. The method takes no seed parameter; the global random seed can be set through mindspore.dataset.config.set_seed, and the data is reshuffled at every iteration.
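Both interfaces take a buffer_size argument. As a rough illustration of what a buffer-based shuffle does, the following plain-Python sketch keeps up to buffer_size elements in memory, emits a randomly chosen one whenever a new element arrives, and drains the rest at the end. The function name buffered_shuffle and its seed argument are hypothetical and exist only for this illustration; this is not the implementation used by either TensorFlow or MindSpore, and its output order will not match either library's.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def buffered_shuffle(
    source: Iterable[T], buffer_size: int, seed: Optional[int] = None
) -> Iterator[T]:
    """Hypothetical helper: shuffle a stream using a fixed-size buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)  # local generator, so the seed affects only this call
    buffer: List[T] = []
    for item in source:
        if len(buffer) < buffer_size:
            # Still filling the buffer; nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random buffered element and
        # store the incoming element in the freed slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Source exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


data = [[1, 2], [3, 4], [5, 6]]
# Some permutation of data; the exact order depends on the seed.
print(list(buffered_shuffle(data, buffer_size=2, seed=57)))
# With a buffer of one element there is nothing to choose from, so order is kept.
print(list(buffered_shuffle(data, buffer_size=1)))
```

In this sketch a small buffer only mixes elements locally: the first element emitted always comes from the first buffer_size elements of the input. A buffer at least as large as the dataset is needed for a uniform shuffle of the whole dataset.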
Code Example
# The following implements shuffle with MindSpore.
import numpy as np
import mindspore.dataset as ds
ds.config.set_seed(57)
data = np.array([[1, 2], [3, 4], [5, 6]])
dataset = ds.NumpySlicesDataset(data=data, column_names=["data"], shuffle=False)
dataset = dataset.shuffle(2)
for item in dataset.create_dict_iterator():
    print(item["data"])
# [1 2]
# [5 6]
# [3 4]
# The following implements shuffle with TensorFlow.
import tensorflow as tf
tf.compat.v1.enable_eager_execution()
data = tf.constant([[1, 2], [3, 4], [5, 6]])
dataset = tf.data.Dataset.from_tensor_slices(data)
dataset = dataset.shuffle(2, seed=57)
for value in dataset.take(3):
    # .numpy() prints the bare array values rather than the tf.Tensor wrapper
    print(value.numpy())
# [3 4]
# [5 6]
# [1 2]