mindspore.mindrecord
Introduction to mindrecord:
MindRecord is a module for reading, writing, searching and converting datasets in the MindSpore format. Users can read MindRecord data with FileReader and write or modify it with FileWriter. Datasets in other formats can be converted to MindRecord through the corresponding sub-module.
- class mindspore.mindrecord.Cifar100ToMR(source, destination)
Converts a CIFAR-100 dataset to MindRecord.
- Parameters
- Raises
ValueError – If source or destination is invalid.
- class mindspore.mindrecord.Cifar10ToMR(source, destination)
Converts a CIFAR-10 dataset to MindRecord.
- Parameters
- Raises
ValueError – If source or destination is invalid.
- class mindspore.mindrecord.FileReader(file_name, num_consumer=4, columns=None, operator=None)
Class to read MindRecord File series.
- Parameters
file_name (str, list[str]) – A MindRecord file name or a list of MindRecord file names.
num_consumer (int, optional) – Number of consumer threads that load data into memory (default=4). It must be at least 1 and no larger than the number of CPUs.
columns (list[str], optional) – List of fields whose data will be read (default=None).
operator (int, optional) – Reserved parameter for operators (default=None).
- Raises
ParamValueError – If file_name, num_consumer or columns is invalid.
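The constraint on num_consumer and the basic read loop can be sketched as follows. This is a minimal sketch, not the library's own logic: the helper names `pick_num_consumer` and `read_all` are hypothetical, and `get_next()` and `close()` are FileReader methods that are not described in this section. The MindSpore import is deferred so that the helpers can be defined on a machine without MindSpore installed.

```python
import os


def pick_num_consumer(requested=4):
    """Clamp a requested thread count into [1, number of CPUs]."""
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, cpu_count))


def read_all(file_name, columns=None):
    """Return every record of a MindRecord file (or file list) as a list."""
    # Deferred import: MindSpore is only needed when this function runs.
    from mindspore.mindrecord import FileReader

    reader = FileReader(file_name,
                        num_consumer=pick_num_consumer(),
                        columns=columns)
    try:
        # Assumption: get_next() yields the records one after another.
        return list(reader.get_next())
    finally:
        reader.close()
```

Calling `read_all("demo.mindrecord", columns=["label"])` would then load only the `label` field; both the file name and the field name are placeholders.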
- class mindspore.mindrecord.FileWriter(file_name, shard_num=1)
Class to write user defined raw data into MindRecord File series.
- Parameters
- Raises
ParamValueError – If file_name or shard_num is invalid.
- add_index(index_fields)
Select index fields from schema to accelerate reading.
- Parameters
index_fields (list[str]) – Fields to use as indexes; each must be of a primitive type.
- Returns
MSRStatus, SUCCESS or FAILED.
- Raises
ParamTypeError – If index field is invalid.
MRMDefineIndexError – If index field is not a primitive type.
MRMAddIndexError – If failed to add index field.
MRMGetMetaError – If the schema is not set or the metadata cannot be retrieved.
- add_schema(content, desc=None)
Add a schema and return its schema id; raise an exception if the schema cannot be added.
- commit()
Flush data to disk and generate the corresponding db files.
- Returns
MSRStatus, SUCCESS or FAILED.
- Raises
MRMOpenError – If failed to open MindRecord File.
MRMSetHeaderError – If failed to set header.
MRMIndexGeneratorError – If failed to create index generator.
MRMGenerateIndexError – If failed to write to database.
MRMCommitError – If failed to flush data to disk.
- classmethod open_for_append(file_name)
Open MindRecord file and get ready to append data.
- Parameters
file_name (str) – String of MindRecord file name.
- Returns
Instance of FileWriter.
- Raises
ParamValueError – If file_name is invalid.
FileNameError – If path contains invalid character.
MRMOpenError – If failed to open MindRecord File.
MRMOpenForAppendError – If failed to open file for appending data.
- set_header_size(header_size)
Set the size of header.
- Parameters
header_size (int) – Size of header, between 16KB and 128MB.
- Returns
MSRStatus, SUCCESS or FAILED.
- Raises
MRMInvalidHeaderSizeError – If failed to set header size.
- set_page_size(page_size)
Set the size of a page.
- Parameters
page_size (int) – Size of page, between 32KB and 256MB.
- Returns
MSRStatus, SUCCESS or FAILED.
- Raises
MRMInvalidPageSizeError – If failed to set page size.
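The documented ranges for header_size and page_size can be checked before calling either setter. The sketch below is illustrative only: the names `HEADER_SIZE_RANGE`, `PAGE_SIZE_RANGE` and `in_range` are hypothetical, and it assumes that sizes are given in bytes, that 1KB means 1024 bytes, and that both bounds are inclusive.

```python
KB = 1024
MB = 1024 * KB

# Bounds copied from the parameter descriptions; inclusive bounds and
# byte units are assumptions of this sketch.
HEADER_SIZE_RANGE = (16 * KB, 128 * MB)
PAGE_SIZE_RANGE = (32 * KB, 256 * MB)


def in_range(size, bounds):
    """Return True if size (in bytes) lies within the inclusive bounds."""
    low, high = bounds
    return isinstance(size, int) and low <= size <= high
```

For example, `in_range(1 << 24, HEADER_SIZE_RANGE)` accepts a 16MB header, while `in_range(16 * KB, PAGE_SIZE_RANGE)` rejects a page that is too small.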
- write_raw_data(raw_data, parallel_writer=False)
Write raw data, generating the sequential pair of MindRecord files, and by default validate the data against the predefined schema.
- Parameters
- Raises
ParamTypeError – If index field is invalid.
MRMOpenError – If failed to open MindRecord File.
MRMValidateDataError – If data does not match blob fields.
MRMSetHeaderError – If failed to set header.
MRMWriteDatasetError – If failed to write dataset.
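The FileWriter methods above are typically chained in the order add_schema, add_index, write_raw_data, commit. The following is a minimal sketch of that order under stated assumptions: the schema layout (`{"field": {"type": ...}}`), the field names, and the use of a list of dicts for raw_data are not given in this section and are assumptions here, and `make_records` and `write_demo` are hypothetical helper names. The MindSpore import is deferred so the helpers can be defined without MindSpore installed.

```python
# Illustrative schema: field names and types are made up for this sketch.
DEMO_SCHEMA = {"label": {"type": "int32"}, "data": {"type": "bytes"}}


def make_records(count):
    """Build `count` toy records whose keys match DEMO_SCHEMA."""
    return [{"label": i, "data": bytes([i % 256])} for i in range(count)]


def write_demo(file_name, records):
    """Write records to a single MindRecord file and return commit()'s status."""
    # Deferred import: MindSpore is only needed when this function runs.
    from mindspore.mindrecord import FileWriter

    writer = FileWriter(file_name, shard_num=1)
    writer.add_schema(DEMO_SCHEMA, "demo schema")  # content, desc
    writer.add_index(["label"])                    # index a primitive-type field
    writer.write_raw_data(records)                 # validated against the schema
    return writer.commit()                         # flush data to disk
```

A call such as `write_demo("demo.mindrecord", make_records(10))` would produce one file; the file name is a placeholder.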
- class mindspore.mindrecord.ImageNetToMR(map_file, image_dir, destination, partition_number=1)
Converts an ImageNet dataset to MindRecord.
- Parameters
map_file (str) – The map file that indicates the label of each class. Its content should look like this:
    n02119789 1 pen
    n02100735 2 notebook
    n02110185 3 mouse
    n02096294 4 orange
image_dir (str) – Image directory that contains the n02119789, n02100735, n02110185 and n02096294 directories.
destination (str) – The MindRecord file path to transform into.
partition_number (int, optional) – Partition size (default=1).
- Raises
ValueError – If map_file, image_dir or destination is invalid.
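A map file can be sanity-checked before it is handed to ImageNetToMR. The parser below is a sketch, not part of MindSpore: `parse_map_file` is a hypothetical helper, and it assumes one entry per line with three whitespace-separated fields (directory name, integer label, class name), as the example in the map_file description suggests.

```python
def parse_map_file(text):
    """Parse map-file text into {directory_name: (label, class_name)}."""
    mapping = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 fields, got {len(parts)}")
        wnid, label, name = parts
        mapping[wnid] = (int(label), name)  # int() rejects non-numeric labels
    return mapping
```

The keys of the returned dict can then be compared with the subdirectory names found in image_dir to catch missing classes early.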
- class mindspore.mindrecord.MindPage(file_name, num_consumer=4)
Class to read MindRecord File series in pagination.
- Parameters
- Raises
ParamValueError – If file_name, num_consumer or columns is invalid.
MRMInitSegmentError – If failed to initialize ShardSegment.
- property candidate_fields
Return candidate category fields.
- Returns
list[str], the fields by which data can be grouped.
- property category_field
Getter for the category field.
- read_at_page_by_id(category_id, page, num_row)
Query by category id in pagination.
- Parameters
- Returns
List, list[dict].
- Raises
ParamValueError – If any parameter is invalid.
MRMFetchDataError – If failed to fetch data by category.
MRMUnsupportedSchemaError – If schema is invalid.
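As a mental model for how page and num_row select rows, the pure-Python sketch below slices an in-memory list. It is not MindPage's implementation: `page_slice` is a hypothetical helper, and it assumes that page is a 0-based page index and that num_row is the number of rows per page, neither of which is stated in this section.

```python
def page_slice(rows, page, num_row):
    """Return the rows on one page, assuming 0-based page indexes."""
    if page < 0 or num_row <= 0:
        raise ValueError("page must be >= 0 and num_row must be > 0")
    start = page * num_row
    # Slicing past the end yields a short (or empty) page rather than an error.
    return rows[start:start + num_row]
```

Under these assumptions, seven rows with num_row=3 give pages of three, three and one rows, and any later page is empty.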
- class mindspore.mindrecord.MnistToMR(source, destination, partition_number=1)
Converts an MNIST dataset to MindRecord.
- Parameters
- Raises
ValueError – If source, destination or partition_number is invalid.