mindspore.dataset.audio.Vad
- class mindspore.dataset.audio.Vad(sample_rate, trigger_level=7.0, trigger_time=0.25, search_time=1.0, allowed_gap=0.25, pre_trigger_time=0.0, boot_time=0.35, noise_up_time=0.1, noise_down_time=0.01, noise_reduction_amount=1.35, measure_freq=20.0, measure_duration=None, measure_smooth_time=0.4, hp_filter_freq=50.0, lp_filter_freq=6000.0, hp_lifter_freq=150.0, lp_lifter_freq=2000.0)
- Voice activity detector.
- Attempts to trim silence and quiet background sounds from the ends of speech recordings.
- Similar to the SoX implementation.
- Parameters
- sample_rate (int) – Sampling rate of audio signal. 
- trigger_level (float, optional) – The measurement level used to trigger activity detection. Default: 7.0.
- trigger_time (float, optional) – The time constant (in seconds) used to help ignore short bursts of sound. Default: 0.25.
- search_time (float, optional) – The amount of audio (in seconds) to search for quieter/shorter bursts of audio to include prior to the detected trigger point. Default: 1.0.
- allowed_gap (float, optional) – The allowed gap (in seconds) between quieter/shorter bursts of audio to include prior to the detected trigger point. Default: 0.25.
- pre_trigger_time (float, optional) – The amount of audio (in seconds) to preserve before the trigger point and any found quieter/shorter bursts. Default: 0.0.
- boot_time (float, optional) – The time for the initial noise estimate. Default: 0.35.
- noise_up_time (float, optional) – Time constant used by the adaptive noise estimator when the noise level is increasing. Default: 0.1.
- noise_down_time (float, optional) – Time constant used by the adaptive noise estimator when the noise level is decreasing. Default: 0.01.
- noise_reduction_amount (float, optional) – Amount of noise reduction to use in the detection algorithm. Default: 1.35.
- measure_freq (float, optional) – Frequency of the algorithm's processing/measurements. Default: 20.0.
- measure_duration (float, optional) – The duration of measurement. Default: None, which uses twice the measurement period.
- measure_smooth_time (float, optional) – Time constant used to smooth spectral measurements. Default: 0.4.
- hp_filter_freq (float, optional) – The 'brick-wall' frequency of the high-pass filter applied at the input to the detector algorithm. Default: 50.0.
- lp_filter_freq (float, optional) – The 'brick-wall' frequency of the low-pass filter applied at the input to the detector algorithm. Default: 6000.0.
- hp_lifter_freq (float, optional) – The 'brick-wall' frequency of the high-pass lifter used in the detector algorithm. Default: 150.0.
- lp_lifter_freq (float, optional) – The 'brick-wall' frequency of the low-pass lifter used in the detector algorithm. Default: 2000.0.
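The default for measure_duration can be illustrated with a little arithmetic. The sketch below assumes that measure_freq is given in Hz and that the "measurement period" is its reciprocal; the helper name default_measure_duration and the 16000 Hz sample rate are hypothetical and not part of MindSpore.

```python
def default_measure_duration(measure_freq: float = 20.0) -> float:
    """Return twice the measurement period, assuming period = 1 / measure_freq."""
    period = 1.0 / measure_freq  # seconds between measurements (assumption)
    return 2.0 * period


# With the default measure_freq, consecutive measurement windows would overlap,
# because each window spans two measurement periods.
duration = default_measure_duration()

# Window length in samples for an illustrative (hypothetical) 16000 Hz signal.
samples = round(duration * 16000)
print(duration, samples)
```

Under these assumptions, passing measure_duration explicitly is only needed when a window other than two measurement periods is wanted.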
 
- Raises
- TypeError – If sample_rate is not of type int. 
- ValueError – If sample_rate is not a positive number. 
- TypeError – If trigger_level is not of type float. 
- TypeError – If trigger_time is not of type float. 
- ValueError – If trigger_time is a negative number. 
- TypeError – If search_time is not of type float. 
- ValueError – If search_time is a negative number. 
- TypeError – If allowed_gap is not of type float. 
- ValueError – If allowed_gap is a negative number. 
- TypeError – If pre_trigger_time is not of type float. 
- ValueError – If pre_trigger_time is a negative number. 
- TypeError – If boot_time is not of type float. 
- ValueError – If boot_time is a negative number. 
- TypeError – If noise_up_time is not of type float. 
- ValueError – If noise_up_time is a negative number. 
- TypeError – If noise_down_time is not of type float. 
- ValueError – If noise_down_time is a negative number. 
- ValueError – If noise_up_time is less than noise_down_time.
- TypeError – If noise_reduction_amount is not of type float. 
- ValueError – If noise_reduction_amount is a negative number. 
- TypeError – If measure_freq is not of type float. 
- ValueError – If measure_freq is not a positive number. 
- TypeError – If measure_duration is not of type float. 
- ValueError – If measure_duration is a negative number. 
- TypeError – If measure_smooth_time is not of type float. 
- ValueError – If measure_smooth_time is a negative number. 
- TypeError – If hp_filter_freq is not of type float. 
- ValueError – If hp_filter_freq is not a positive number. 
- TypeError – If lp_filter_freq is not of type float. 
- ValueError – If lp_filter_freq is not a positive number. 
- TypeError – If hp_lifter_freq is not of type float. 
- ValueError – If hp_lifter_freq is not a positive number. 
- TypeError – If lp_lifter_freq is not of type float. 
- ValueError – If lp_lifter_freq is not a positive number. 
- RuntimeError – If the input tensor is not of shape <…, time>.
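The argument rules above can be summarized as a plain-Python pre-check. This is a minimal sketch that mirrors the documented list, not MindSpore's own validation code; the function check_vad_args and its error messages are hypothetical, and rejecting bool for sample_rate is an added assumption.

```python
# Hypothetical pre-check mirroring the documented Raises list; not MindSpore code.
NON_NEGATIVE = ("trigger_time", "search_time", "allowed_gap", "pre_trigger_time",
                "boot_time", "noise_up_time", "noise_down_time",
                "noise_reduction_amount", "measure_duration",
                "measure_smooth_time")
POSITIVE = ("measure_freq", "hp_filter_freq", "lp_filter_freq",
            "hp_lifter_freq", "lp_lifter_freq")


def check_vad_args(sample_rate, **kwargs):
    """Raise TypeError or ValueError following the documented rules."""
    # Assumption: bool is rejected even though it subclasses int.
    if not isinstance(sample_rate, int) or isinstance(sample_rate, bool):
        raise TypeError("sample_rate must be of type int")
    if sample_rate <= 0:
        raise ValueError("sample_rate must be a positive number")

    for name, value in kwargs.items():
        if name == "measure_duration" and value is None:
            continue  # None is the documented default
        if not isinstance(value, float):
            raise TypeError(f"{name} must be of type float")
        if name in POSITIVE and value <= 0:
            raise ValueError(f"{name} must be a positive number")
        if name in NON_NEGATIVE and value < 0:
            raise ValueError(f"{name} must not be negative")

    # Cross-parameter rule, using the documented defaults when omitted.
    up = kwargs.get("noise_up_time", 0.1)
    down = kwargs.get("noise_down_time", 0.01)
    if up < down:
        raise ValueError("noise_up_time must not be less than noise_down_time")


check_vad_args(600, trigger_level=7.0, measure_duration=None)  # passes silently
```

Note that trigger_level is only type-checked here, since the list above documents no value constraint for it.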
 
 - Supported Platforms:
- CPU
 - Examples
   >>> import numpy as np
   >>> import mindspore.dataset as ds
   >>> import mindspore.dataset.audio as audio
   >>>
   >>> # Use the transform in dataset pipeline mode
   >>> waveform = np.random.random([5, 1000])  # 5 samples
   >>> numpy_slices_dataset = ds.NumpySlicesDataset(data=waveform, column_names=["audio"])
   >>> transforms = [audio.Vad(sample_rate=600)]
   >>> numpy_slices_dataset = numpy_slices_dataset.map(operations=transforms, input_columns=["audio"])
   >>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
   ...     print(item["audio"].shape, item["audio"].dtype)
   ...     break
   (660,) float64
   >>>
   >>> # Use the transform in eager mode
   >>> waveform = np.random.random([1000])  # 1 sample
   >>> output = audio.Vad(sample_rate=600)(waveform)
   >>> print(output.shape, output.dtype)
   (660,) float64
 - Tutorial Examples: