mindspore.dataset.audio.SlidingWindowCmn
- class mindspore.dataset.audio.SlidingWindowCmn(cmn_window=600, min_cmn_window=100, center=False, norm_vars=False)
- Apply sliding-window cepstral mean (and optionally variance) normalization per utterance.
- Parameters
- cmn_window (int, optional) – Window size, in frames, for the running-average CMN computation. Default: 600.
- min_cmn_window (int, optional) – Minimum CMN window used at the start of decoding (adds latency only at the start). Only applies if center is False; ignored if center is True. Default: 100.
- center (bool, optional) – If True, use a window centered on the current frame. If False, the window extends to the left of the current frame. Default: False.
- norm_vars (bool, optional) – If True, normalize the variance to one. Default: False.
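
To make the parameters above more concrete, here is a minimal NumPy sketch of sliding-window CMN. It is not MindSpore's implementation: the function name `sliding_window_cmn_sketch`, the `eps` guard, the (frames, features) input layout, and the Kaldi-style handling of windows at the edges of the utterance are all assumptions made for illustration.

```python
import numpy as np


def sliding_window_cmn_sketch(feats, cmn_window=600, min_cmn_window=100,
                              center=False, norm_vars=False, eps=1e-20):
    """Illustrative sliding-window CMN over an array of shape (frames, features).

    Assumes cmn_window >= 1. Boundary handling follows a common Kaldi-style
    convention (an assumption), and `eps` is a hypothetical guard against
    division by zero that is not part of the documented API.
    """
    feats = np.asarray(feats, dtype=np.float64)
    num_frames = feats.shape[0]
    out = np.zeros_like(feats)
    for t in range(num_frames):
        if center:
            # Window centered on frame t.
            start = t - cmn_window // 2
            end = start + cmn_window
        else:
            # Window to the left of frame t; early frames look ahead so the
            # window holds at least min_cmn_window frames when available.
            start = t - cmn_window
            end = max(t + 1, min_cmn_window)
        # Keep the window inside [0, num_frames).
        if start < 0:
            if center:
                end -= start  # preserve the window length by extending right
            start = 0
        if end > num_frames:
            start = max(0, start - (end - num_frames))
            end = num_frames
        window = feats[start:end]
        mean = window.mean(axis=0)
        out[t] = feats[t] - mean
        if norm_vars:
            if end - start == 1:
                out[t] = 0.0  # variance is undefined for a single frame
            else:
                var = (window ** 2).mean(axis=0) - mean ** 2
                out[t] /= np.sqrt(np.maximum(var, eps))
    return out


rng = np.random.default_rng(0)
feats = rng.random((16, 3))  # 16 frames, 3 coefficients per frame
normalized = sliding_window_cmn_sketch(feats)
print(normalized.shape, normalized.dtype)
```

Under these assumptions, an utterance shorter than min_cmn_window (such as the 16 frames here) ends up with every window covering the whole utterance, so the defaults reduce to subtracting the per-utterance mean.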
 
- Raises
- TypeError – If cmn_window is not of type int. 
- ValueError – If cmn_window is a negative number. 
- TypeError – If min_cmn_window is not of type int. 
- ValueError – If min_cmn_window is a negative number. 
- TypeError – If center is not of type bool. 
- TypeError – If norm_vars is not of type bool. 
 
 - Supported Platforms:
- CPU
 - Examples
>>> import numpy as np
>>> import mindspore.dataset as ds
>>> import mindspore.dataset.audio as audio
>>>
>>> # Use the transform in dataset pipeline mode
>>> waveform = np.random.random([5, 16, 3])  # 5 samples
>>> numpy_slices_dataset = ds.NumpySlicesDataset(data=waveform, column_names=["audio"])
>>> transforms = [audio.SlidingWindowCmn()]
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=transforms, input_columns=["audio"])
>>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["audio"].shape, item["audio"].dtype)
...     break
(16, 3) float64
>>>
>>> # Use the transform in eager mode
>>> waveform = np.random.random([16, 3])  # 1 sample
>>> output = audio.SlidingWindowCmn()(waveform)
>>> print(output.shape, output.dtype)
(16, 3) float64
 - Tutorial Examples: