mindspore.dataset.serialize

mindspore.dataset.serialize(dataset, json_filepath='')

Serialize a dataset pipeline into a JSON file.

Note

Some Python objects cannot currently be serialized, for example callable user-defined Python functions (Python UDFs) and GeneratorDataset. For a pipeline containing such objects, only partially serialized JSON output is produced. Deserializing that output, executing the deserialized pipeline, or re-serializing it may result in an error. For instance, serializing a pipeline that contains a Python UDF emits a warning, and any JSON file produced for that pipeline is not valid for deserialization.

Parameters
  • dataset (Dataset) – The starting node of the dataset pipeline to serialize.

  • json_filepath (str) – The filepath where the serialized JSON file will be generated (default='').

Returns

Dict, a dictionary that contains the serialized dataset graph.
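As a rough illustration of working with the returned dictionary or the generated JSON file, the sketch below walks a nested graph and lists its operations from source to final node. It uses only the Python standard library. The `serialized_data` dict is a hypothetical stand-in, and the `op_type` and `children` keys are an assumption about the output layout that this page does not specify. The helper `list_ops` is likewise illustrative and not part of MindSpore.

```python
import json
import os
import tempfile


def list_ops(node):
    """Return op_type names ordered from the source node to the final node."""
    ops = []
    for child in node.get("children", []):
        ops.extend(list_ops(child))
    ops.append(node.get("op_type", "<unknown>"))
    return ops


# Hypothetical stand-in for the dict returned by ds.serialize(). The key
# layout ("op_type", "children") is an assumption, not a documented format.
serialized_data = {
    "op_type": "Batch",
    "batch_size": 10,
    "children": [
        {
            "op_type": "Map",
            "children": [{"op_type": "MnistDataset", "children": []}],
        }
    ],
}

# Simulate the JSON file that serialize() would write, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "pipeline.json")
    with open(json_path, "w") as f:
        json.dump(serialized_data, f, indent=2)
    with open(json_path) as f:
        loaded = json.load(f)

ops = list_ops(loaded)
print(" -> ".join(ops))  # MnistDataset -> Map -> Batch
```

With a real pipeline, the same helper could be applied to the value returned by `ds.serialize`, provided the actual output follows the assumed layout.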

Raises

OSError – If the file cannot be opened.
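One plausible cause of a file that cannot be opened is a missing parent directory. The sketch below, which uses only the standard library, creates the parent directory before the path is handed to `serialize`. The helper name `prepare_json_path` and the file names are hypothetical, and whether this covers the failure in a given setup is an assumption.

```python
import tempfile
from pathlib import Path


def prepare_json_path(json_filepath):
    """Create the parent directory if it is missing and return the path as str."""
    path = Path(json_filepath)
    path.parent.mkdir(parents=True, exist_ok=True)
    return str(path)


# Demonstration in a temporary directory; no MindSpore call is made here.
tmp_root = tempfile.mkdtemp()
target = prepare_json_path(Path(tmp_root) / "configs" / "mnist_pipeline.json")
parent_exists = Path(target).parent.is_dir()
print(parent_exists)  # True
```

The returned string could then be passed as `json_filepath`, with the call wrapped in `try` / `except OSError` to handle remaining causes such as missing write permission.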

Examples

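>>> import mindspore.dataset as ds
>>> import mindspore.dataset.transforms as transforms
>>>
>>> # mnist_dataset_dir: path to a local MNIST dataset directory, defined by the caller
>>> dataset = ds.MnistDataset(mnist_dataset_dir, num_samples=100)
>>> one_hot_encode = transforms.OneHot(10)  # num_classes is the input argument
>>> dataset = dataset.map(operations=one_hot_encode, input_columns="label")
>>> dataset = dataset.batch(batch_size=10, drop_remainder=True)
>>> # Serialize the pipeline to a JSON file; the returned dict holds the same graph
>>> serialized_data = ds.serialize(dataset, json_filepath="/path/to/mnist_dataset_pipeline.json")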