Function mindspore::dataset::LibriTTS

Function Documentation

inline std::shared_ptr<LibriTTSDataset> mindspore::dataset::LibriTTS(const std::string &dataset_dir, const std::string &usage = "all", const std::shared_ptr<Sampler> &sampler = std::make_shared<RandomSampler>(), const std::shared_ptr<DatasetCache> &cache = nullptr)

Function to create a LibriTTSDataset.

Note

The generated dataset has seven columns [‘waveform’, ‘sample_rate’, ‘original_text’, ‘normalized_text’, ‘speaker_id’, ‘chapter_id’, ‘utterance_id’].
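The column list above can be checked against a fetched row before individual columns are accessed. The following is a minimal sketch, not part of the MindSpore API: `kLibriTtsColumns` and `MissingLibriTtsColumns` are hypothetical names introduced here for illustration, and the row is assumed to be a `std::unordered_map` keyed by column name, as in the example further down this page.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// The seven column names from the note above.
const std::array<const char *, 7> kLibriTtsColumns = {{
    "waveform", "sample_rate", "original_text", "normalized_text",
    "speaker_id", "chapter_id", "utterance_id"}};

// Hypothetical helper (not a MindSpore function): returns the expected
// column names that are absent from `row`, in the order listed above.
template <typename T>
std::vector<std::string> MissingLibriTtsColumns(
    const std::unordered_map<std::string, T> &row) {
  std::vector<std::string> missing;
  for (const char *name : kLibriTtsColumns) {
    if (row.find(name) == row.end()) {
      missing.emplace_back(name);
    }
  }
  return missing;
}
```

With a row of type `std::unordered_map<std::string, mindspore::MSTensor>`, an empty result would indicate that all seven columns are present.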

Parameters
  • dataset_dir[in] Path to the root directory that contains the dataset.

  • usage[in] Subset of the LibriTTS dataset to load. Can be “dev-clean”, “dev-other”, “test-clean”, “test-other”, “train-clean-100”, “train-clean-360”, “train-other-500”, or “all” (default = “all”).

  • sampler[in] Shared pointer to a sampler object used to choose samples from the dataset. If sampler is not given, a RandomSampler will be used to randomly iterate the entire dataset (default = RandomSampler()).

  • cache[in] Tensor cache to use (default=nullptr, which means no cache is used).
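Because `usage` is passed as a plain string, a caller may want to validate it before constructing the dataset. The sketch below is illustrative only: `kLibriTtsSubsets`, `IsValidLibriTtsUsage`, and `ExpandLibriTtsUsage` are hypothetical helpers, not MindSpore functions, and the expansion of “all” into the seven named subsets is an assumption based on the parameter description above.

```cpp
#include <array>
#include <string>
#include <vector>

// Subset names taken from the `usage` parameter description above.
const std::array<const char *, 7> kLibriTtsSubsets = {{
    "dev-clean", "dev-other", "test-clean", "test-other",
    "train-clean-100", "train-clean-360", "train-other-500"}};

// Hypothetical helper: true if `usage` is "all" or one of the subset names.
bool IsValidLibriTtsUsage(const std::string &usage) {
  if (usage == "all") {
    return true;
  }
  for (const char *name : kLibriTtsSubsets) {
    if (usage == name) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: lists the subsets a valid `usage` value selects.
// Assumption: "all" selects every named subset. Returns an empty vector
// for an unrecognized value.
std::vector<std::string> ExpandLibriTtsUsage(const std::string &usage) {
  std::vector<std::string> subsets;
  if (usage == "all") {
    subsets.assign(kLibriTtsSubsets.begin(), kLibriTtsSubsets.end());
  } else if (IsValidLibriTtsUsage(usage)) {
    subsets.push_back(usage);
  }
  return subsets;
}
```

A caller could run `IsValidLibriTtsUsage(usage)` first and report an error for an unrecognized value instead of passing it on to `LibriTTS`.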

Returns

Shared pointer to the LibriTTSDataset.

Example
        /* Define dataset path and LibriTTS object */
        std::string folder_path = "/path/to/libri_tts_dataset_directory";
        std::shared_ptr<Dataset> ds = LibriTTS(folder_path);
  
        /* Create iterator to read dataset */
        std::shared_ptr<Iterator> iter = ds->CreateIterator();
        std::unordered_map<std::string, mindspore::MSTensor> row;
        iter->GetNextRow(&row);
  
        /* Note: In the LibriTTS dataset, each data dictionary has seven columns ["waveform", "sample_rate",
         * "original_text", "normalized_text", "speaker_id", "chapter_id", "utterance_id"]. */
        auto waveform = row["waveform"];