mindspore.dataset.serialize
- mindspore.dataset.serialize(dataset, json_filepath='')
Serialize dataset pipeline into a JSON file.
Note
Complete serialization of Python objects is not currently supported. Unsupported scenarios include data pipelines that use GeneratorDataset, and map or batch operations that contain custom Python functions. For Python objects, serialization does not capture the full object content, so deserializing the resulting JSON file may fail. For example, serializing a data pipeline that contains Python user-defined functions reports a warning, and the resulting JSON file cannot be deserialized into a usable data pipeline.
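The limitation described in this note is not specific to MindSpore: JSON has no representation for Python callables. The following is a minimal, library-independent sketch that uses only the standard `json` module. The `pipeline` dict, `user_defined_op`, and the `fallback` helper are hypothetical and do not reflect MindSpore's internal format or implementation.

```python
import json


def user_defined_op(x):
    """Stand-in for a custom Python function used in a map operation."""
    return x + 1


# Hypothetical pipeline description; NOT MindSpore's real schema.
pipeline = {"op_type": "Map", "operations": [user_defined_op]}

# Plain JSON encoding fails because a function is not a JSON value.
try:
    json.dumps(pipeline)
    strict_ok = True
except TypeError:
    strict_ok = False


def fallback(obj):
    # Record only the callable's name; its code is not stored.
    return {"python_callable": getattr(obj, "__name__", repr(obj))}


# With a fallback encoder the dump succeeds, but the function body is lost.
encoded = json.dumps(pipeline, default=fallback)
decoded = json.loads(encoded)
```

After the round trip, `decoded["operations"][0]` is a plain dict holding a name rather than something that can be called, which is the same kind of information loss the note warns about.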
- Parameters
dataset (Dataset) – The starting node.
json_filepath (str) – The filepath where a serialized JSON file will be generated. Default: ''.
- Returns
Dict, a dictionary containing the serialized dataset graph.
- Raises
OSError – If the file cannot be opened.
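One common reason a file cannot be opened is that its parent directory does not exist. The sketch below uses only the standard library to create the directory in advance. `prepare_json_path` is a hypothetical helper, not part of MindSpore, and the sketch assumes the error arises from an ordinary file-open failure.

```python
import os
import tempfile


def prepare_json_path(json_filepath):
    """Create the parent directory of json_filepath if needed and return the path."""
    parent = os.path.dirname(os.path.abspath(json_filepath))
    os.makedirs(parent, exist_ok=True)
    return json_filepath


# Demonstration inside a temporary directory.
base_dir = tempfile.mkdtemp()
target = prepare_json_path(os.path.join(base_dir, "pipelines", "mnist.json"))

# Opening a file in a directory that does not exist raises a subclass of OSError.
missing = os.path.join(base_dir, "does_not_exist", "mnist.json")
try:
    with open(missing, "w"):
        pass
    raised = False
except OSError:
    raised = True
```

The returned path could then be passed as `json_filepath` to `ds.serialize`.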
Examples
>>> import mindspore.dataset as ds
>>> import mindspore.dataset.transforms as transforms
>>>
>>> mnist_dataset_dir = "/path/to/mnist_dataset_directory"
>>> dataset = ds.MnistDataset(mnist_dataset_dir, num_samples=100)
>>> one_hot_encode = transforms.OneHot(10)  # num_classes is input argument
>>> dataset = dataset.map(operations=one_hot_encode, input_columns="label")
>>> dataset = dataset.batch(batch_size=10, drop_remainder=True)
>>> # serialize it to JSON file
>>> serialized_data = ds.serialize(dataset, json_filepath="/path/to/mnist_dataset_pipeline.json")