Function mindspore::dataset::MindData
Defined in File datasets.h
Function Documentation
Function to create a MindDataDataset.
- Parameters
dataset_files – [in] List of dataset files to be read directly.
columns_list – [in] List of columns to be read (default={}).
sampler – [in] Shared pointer to a sampler object used to choose samples from the dataset. If no sampler is given, a RandomSampler is used to iterate over the entire dataset in random order (default = RandomSampler()). Supported samplers: SubsetRandomSampler, PkSampler, RandomSampler, SequentialSampler, DistributedSampler.
padded_sample – [in] Samples to be appended to the dataset; their keys must be the same as those in columns_list.
num_padded – [in] Number of padding samples. Dataset size plus num_padded should be divisible by num_shards.
shuffle_mode – [in] The mode for shuffling data every epoch (default=ShuffleMode::kGlobal). Can be any of:
ShuffleMode::kFalse - No shuffling is performed.
ShuffleMode::kFiles - Shuffle files only.
ShuffleMode::kGlobal - Shuffle both the files and samples.
ShuffleMode::kInfile - Shuffle samples in file.
cache – [in] Tensor cache to use (default=nullptr, which means no cache is used).
- Returns
Shared pointer to the MindDataDataset.
Example
/* Define dataset path and MindData object */
std::string file_path1 = "/path/to/mindrecord_file1";
std::string file_path2 = "/path/to/mindrecord_file2";
std::vector<std::string> file_list = {file_path1, file_path2};
std::vector<std::string> column_names = {"data", "file_name", "label"};
std::shared_ptr<Dataset> ds = MindData(file_list, column_names);

/* Create iterator to read dataset */
std::shared_ptr<Iterator> iter = ds->CreateIterator();
std::unordered_map<std::string, mindspore::MSTensor> row;
iter->GetNextRow(&row);

/* Note: As we defined before, each data dictionary owns keys "data", "file_name" and "label" */
auto data = row["data"];
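The requirement that the dataset size plus num_padded be divisible by num_shards can be worked out ahead of the MindData call. The following is a minimal sketch of that arithmetic only. PaddingNeeded and SamplesPerShard are hypothetical helper names, not part of the MindSpore API, and the sketch assumes the dataset size and shard count are already known to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper (not a MindSpore function): returns the smallest
// number of padding samples such that (dataset_size + padding) is
// divisible by num_shards.
int64_t PaddingNeeded(int64_t dataset_size, int64_t num_shards) {
  if (dataset_size < 0 || num_shards <= 0) {
    throw std::invalid_argument("dataset_size must be >= 0 and num_shards must be > 0");
  }
  const int64_t remainder = dataset_size % num_shards;
  const int64_t padding = (remainder == 0) ? 0 : num_shards - remainder;
  // Sanity check on the divisibility condition stated for num_padded.
  assert((dataset_size + padding) % num_shards == 0);
  return padding;
}

// Hypothetical helper: number of samples each shard receives once the
// dataset has been padded with PaddingNeeded() samples.
int64_t SamplesPerShard(int64_t dataset_size, int64_t num_shards) {
  return (dataset_size + PaddingNeeded(dataset_size, num_shards)) / num_shards;
}
```

For instance, with 10 samples and 4 shards, PaddingNeeded(10, 4) yields 2, so a num_padded of 2 gives 12 samples in total and 3 per shard. Any larger value that keeps the sum a multiple of num_shards would also satisfy the condition; the sketch simply picks the smallest one.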