Function mindspore::dataset::IMDB
Defined in File datasets.h
Function Documentation
A source dataset that reads and parses the IMDB dataset.
Note
The generated dataset has two columns [“text”, “label”].
- Parameters
dataset_dir – [in] Path to the root directory that contains the dataset.
usage – [in] The type of dataset. Acceptable usages include “train”, “test” or “all” (default = “all”).
sampler – [in] Shared pointer to a sampler object used to choose samples from the dataset. If sampler is not given, a RandomSampler will be used to randomly iterate the entire dataset (default = RandomSampler()).
cache – [in] Tensor cache to use (default = nullptr, which means no cache is used).
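Because usage accepts only the three strings listed above, a caller may want to check the value before constructing the dataset. The following is a minimal sketch of such a check. IsValidImdbUsage is a hypothetical helper and not part of MindSpore, and the exact, case-sensitive comparison is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical helper (not part of the MindSpore API): returns true when
// `usage` is one of the three values the documentation lists as acceptable.
// Assumption: the comparison is exact and case-sensitive.
inline bool IsValidImdbUsage(const std::string &usage) {
  static const char *const kAccepted[] = {"train", "test", "all"};
  for (const char *accepted : kAccepted) {
    if (usage == accepted) {
      return true;
    }
  }
  return false;
}
```

A caller could run this check on a user-supplied string and fall back to “all” (the documented default) or report an error before passing the value to IMDB.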
- Returns
Shared pointer to the IMDBDataset.
Example
    /* Define dataset path and MindData object */
    std::string dataset_path = "/path/to/imdb_dataset_directory";
    std::shared_ptr<Dataset> ds = IMDB(dataset_path, "all");

    /* Create iterator to read dataset */
    std::shared_ptr<Iterator> iter = ds->CreateIterator();
    std::unordered_map<std::string, mindspore::MSTensor> row;
    iter->GetNextRow(&row);

    /* Note: In IMDB dataset, each data dictionary has keys "text" and "label" */
    auto text = row["text"];