Function mindspore::dataset::Tedlium
Defined in File datasets.h
Function Documentation
Function to create a TedliumDataset.
Note
The generated dataset has six columns [“waveform”, “sample_rate”, “transcript”, “talk_id”, “speaker_id”, “identifier”].
- Parameters
dataset_dir – [in] Path to the root directory that contains the dataset.
release – [in] Release of the dataset. Can be “release1”, “release2” or “release3”.
usage – [in] Part of the TEDLIUM dataset to load. For release1 and release2, can be “train”, “test” or “all”; for release3, can only be “all” (default = “all”).
extensions – [in] Extension of the audio files. Only “.sph” is currently supported (default = “.sph”).
sampler – [in] Shared pointer to a sampler object used to choose samples from the dataset. If sampler is not given, a RandomSampler will be used to randomly iterate the entire dataset (default = RandomSampler()).
cache – [in] Tensor cache to use (default = nullptr, which means no cache is used).
- Returns
Shared pointer to the TedliumDataset.
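Example
The release and usage parameters constrain each other, so it can help to check the combination before building the dataset. The following is a minimal sketch of such a check, based only on the constraints stated in the parameter list above. IsValidTedliumUsage is a hypothetical helper introduced here for illustration; it is not part of the MindSpore API, and the library may perform its own validation differently.

```cpp
#include <string>

// Hypothetical helper (not part of MindSpore): returns true when the
// release/usage pair matches the constraints documented for Tedlium.
//   release1, release2 -> usage may be "train", "test" or "all"
//   release3           -> usage may only be "all"
bool IsValidTedliumUsage(const std::string &release,
                         const std::string &usage = "all") {
  if (release == "release3") {
    return usage == "all";
  }
  if (release == "release1" || release == "release2") {
    return usage == "train" || usage == "test" || usage == "all";
  }
  // Any other release string is not listed in the documentation.
  return false;
}

// Assumed call shape once the pair is validated, following the documented
// parameter order (dataset_dir, release, usage):
//   auto ds = mindspore::dataset::Tedlium(dataset_dir, "release1", "train");
```

With this sketch, IsValidTedliumUsage("release1", "train") is accepted, while IsValidTedliumUsage("release3", "train") is rejected because release3 only allows “all”.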