mindspore.dataset.Dataset.save
Dataset.save(file_name, num_files=1, file_type='mindrecord')
Save the dynamic data processed by the dataset pipeline in a common dataset format. The only supported format is 'mindrecord'. The saved file(s) can be read back with the mindspore.dataset.MindDataset API.

Implicit type casting occurs when saving data as 'mindrecord'. The following table shows how each type is cast.

===============  ==================  ======================================
Type in dataset  Type in mindrecord  Details
===============  ==================  ======================================
bool             None                Not supported
int8             int32
uint8            bytes(1D uint8)     Drop dimension
int16            int32
uint16           int32
int32            int32
uint32           int64
int64            int64
uint64           None                Not supported
float16          float32
float32          float32
float64          float64
string           string              Multi-dimensional string not supported
===============  ==================  ======================================
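
The casting table can be turned into a small pre-flight check that runs before save is called. The sketch below is not part of MindSpore: MINDRECORD_CAST and mindrecord_type are hypothetical names, and treating "multi-dimensional" as more than one dimension is an assumption. For a NumPy array, the type name would typically come from arr.dtype.name, with string columns passed as "string".

```python
# Hypothetical lookup that mirrors the casting table; not a MindSpore API.
MINDRECORD_CAST = {
    "bool": None,        # not supported
    "int8": "int32",
    "uint8": "bytes",    # stored as bytes (1D uint8), dimension dropped
    "int16": "int32",
    "uint16": "int32",
    "int32": "int32",
    "uint32": "int64",
    "int64": "int64",
    "uint64": None,      # not supported
    "float16": "float32",
    "float32": "float32",
    "float64": "float64",
    "string": "string",
}


def mindrecord_type(dtype_name, ndim=1):
    """Return the type name the table lists for dtype_name.

    Raises TypeError for types the table marks as unsupported and for
    unknown names. Assumption: a string column with ndim > 1 counts as
    "multi-dimensional" and is rejected.
    """
    if dtype_name not in MINDRECORD_CAST:
        raise TypeError(f"unknown dataset type: {dtype_name!r}")
    target = MINDRECORD_CAST[dtype_name]
    if target is None:
        raise TypeError(f"{dtype_name} cannot be saved as mindrecord")
    if dtype_name == "string" and ndim > 1:
        raise TypeError("multi-dimensional string cannot be saved as mindrecord")
    return target


print(mindrecord_type("uint16"))   # int32
print(mindrecord_type("float16"))  # float32
```

Running such a check on every column before saving surfaces an unsupported type such as uint64 early, rather than partway through writing the files.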
Note
- To save the samples in order, set the dataset's shuffle to False and num_files to 1.
- Before calling the function, do not use the batch operation, the repeat operation, or data augmentation operations with a random attribute in a map operation.
- When the array dimension is variable, one-dimensional arrays and multi-dimensional arrays whose only variable dimension is dimension 0 are supported.
- MindRecord does not support uint64, multi-dimensional uint8 (the dimension is dropped), or multi-dimensional string.
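
The variable-dimension rule in the notes above can be illustrated with a shape check. This is a minimal sketch of one reading of that rule, not MindSpore's own validation: shapes_ok_for_save is a hypothetical helper, and it assumes that every sample of a column must have the same rank and may differ only in dimension 0.

```python
def shapes_ok_for_save(shapes):
    """Check that a column's per-sample shapes vary only in dimension 0.

    shapes is an iterable of shape tuples, one per sample. Assumption:
    all samples must share the same number of dimensions, and every
    dimension after the first must be identical across samples.
    """
    shapes = [tuple(s) for s in shapes]
    if not shapes:
        return True
    first = shapes[0]
    for shape in shapes[1:]:
        if len(shape) != len(first):
            return False  # rank differs between samples
        if shape[1:] != first[1:]:
            return False  # a dimension other than 0 varies
    return True


print(shapes_ok_for_save([(3,), (5,)]))      # True: 1D arrays of varying length
print(shapes_ok_for_save([(2, 4), (7, 4)]))  # True: only dimension 0 varies
print(shapes_ok_for_save([(2, 4), (2, 5)]))  # False: dimension 1 varies
```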
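Parameters
- file_name (str): Path to the dataset file.
- num_files (int, optional): Number of dataset files. Default: 1.
- file_type (str, optional): Dataset format. Default: 'mindrecord'.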
Examples
>>> import mindspore.dataset as ds
>>> import numpy as np
>>>
>>> def generator_1d():
...     for i in range(10):
...         yield (np.array([i]),)
>>>
>>> # apply dataset operations
>>> d1 = ds.GeneratorDataset(generator_1d, ["data"], shuffle=False)
>>> d1.save('/path/to/save_file')