mindspore.dataset.audio

This module supports audio augmentations. It has two parts: audio transforms and utils. The audio transforms part is a high-performance processing module that provides common audio operations; utils provides general-purpose helpers for audio processing.

The API examples commonly import the following modules:

import mindspore.dataset as ds
import mindspore.dataset.audio as audio
from mindspore.dataset.audio import utils

Alternatively, the audio module can be imported in this equivalent way:

import mindspore.dataset.audio.transforms as audio

Descriptions of common data processing terms are as follows:

TensorOperation, the base class of all data processing operations implemented in C++.
AudioTensorOperation, the base class of all audio processing operations. It is a derived class of TensorOperation.

Transforms

`mindspore.dataset.audio.AllpassBiquad`	Design two-pole all-pass filter with central frequency and bandwidth for audio waveform.
`mindspore.dataset.audio.AmplitudeToDB`	Turn the input audio waveform from the amplitude/power scale to decibel scale.
`mindspore.dataset.audio.Angle`	Calculate the angle of complex number sequence.
`mindspore.dataset.audio.BandBiquad`	Design two-pole band-pass filter for audio waveform.
`mindspore.dataset.audio.BandpassBiquad`	Design two-pole Butterworth band-pass filter for audio waveform.
`mindspore.dataset.audio.BandrejectBiquad`	Design two-pole Butterworth band-reject filter for audio waveform.
`mindspore.dataset.audio.BassBiquad`	Design a bass tone-control effect, also known as two-pole low-shelf filter for audio waveform.
`mindspore.dataset.audio.Biquad`	Perform a biquad filter of input audio.
`mindspore.dataset.audio.ComplexNorm`	Compute the norm of complex number sequence.
`mindspore.dataset.audio.ComputeDeltas`	Compute delta coefficients of a spectrogram.
`mindspore.dataset.audio.Contrast`	Apply contrast effect for audio waveform.
`mindspore.dataset.audio.DBToAmplitude`	Turn a waveform from the decibel scale to the power/amplitude scale.
`mindspore.dataset.audio.DCShift`	Apply a DC shift to the audio.
`mindspore.dataset.audio.DeemphBiquad`	Design two-pole de-emphasis filter for audio waveform of dimension (..., time).
`mindspore.dataset.audio.DetectPitchFrequency`	Detect pitch frequency.
`mindspore.dataset.audio.Dither`	Dither increases the perceived dynamic range of audio stored at a particular bit-depth by eliminating nonlinear truncation distortion.
`mindspore.dataset.audio.EqualizerBiquad`	Design biquad equalizer filter and perform filtering.
`mindspore.dataset.audio.Fade`	Add a fade in and/or fade out to a waveform.
`mindspore.dataset.audio.Flanger`	Apply a flanger effect to the audio.
`mindspore.dataset.audio.FrequencyMasking`	Apply masking to a spectrogram in the frequency domain.
`mindspore.dataset.audio.Gain`	Apply amplification or attenuation to the whole waveform.
`mindspore.dataset.audio.GriffinLim`	Approximate magnitude spectrogram inversion using the GriffinLim algorithm.
`mindspore.dataset.audio.HighpassBiquad`	Design biquad highpass filter and perform filtering.
`mindspore.dataset.audio.InverseMelScale`	Solve for a normal STFT from a mel frequency STFT, using a conversion matrix.
`mindspore.dataset.audio.LFilter`	Design two-pole filter for audio waveform of dimension (..., time).
`mindspore.dataset.audio.LowpassBiquad`	Design two-pole low-pass filter for audio waveform.
`mindspore.dataset.audio.Magphase`	Separate a complex-valued spectrogram with shape (..., 2) into its magnitude and phase.
`mindspore.dataset.audio.MaskAlongAxis`	Apply a mask along axis.
`mindspore.dataset.audio.MaskAlongAxisIID`	Apply a mask along axis.
`mindspore.dataset.audio.MelScale`	Convert normal STFT to STFT at the Mel scale.
`mindspore.dataset.audio.MuLawDecoding`	Decode mu-law encoded signal.
`mindspore.dataset.audio.MuLawEncoding`	Encode signal based on mu-law companding.
`mindspore.dataset.audio.Overdrive`	Apply overdrive on input audio.
`mindspore.dataset.audio.Phaser`	Apply a phasing effect to the audio.
`mindspore.dataset.audio.PhaseVocoder`	Given a STFT tensor, speed up in time without modifying pitch by a factor of rate.
`mindspore.dataset.audio.Resample`	Resample a signal from one frequency to another.
`mindspore.dataset.audio.RiaaBiquad`	Apply RIAA vinyl playback equalization.
`mindspore.dataset.audio.SlidingWindowCmn`	Apply sliding-window cepstral mean (and optionally variance) normalization per utterance.
`mindspore.dataset.audio.SpectralCentroid`	Create a spectral centroid from an audio signal.
`mindspore.dataset.audio.Spectrogram`	Create a spectrogram from an audio signal.
`mindspore.dataset.audio.TimeMasking`	Apply masking to a spectrogram in the time domain.
`mindspore.dataset.audio.TimeStretch`	Stretch Short Time Fourier Transform (STFT) in time without modifying pitch for a given rate.
`mindspore.dataset.audio.TrebleBiquad`	Design a treble tone-control effect.
`mindspore.dataset.audio.Vad`	Attempt to trim silent background sounds from the end of the voice recording.
`mindspore.dataset.audio.Vol`	Apply amplification or attenuation to the whole waveform.
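
To make one pair of entries in the table concrete, the following is a minimal pure-Python sketch of textbook mu-law companding, the idea behind `MuLawEncoding` and `MuLawDecoding`. It is not MindSpore's implementation: the function names `mu_law_encode` and `mu_law_decode` are hypothetical, and the `quantization_channels` parameter with a default of 256 is an assumption chosen for illustration.

```python
import math


def mu_law_encode(x, quantization_channels=256):
    """Map a sample in [-1, 1] to an integer code in [0, quantization_channels - 1].

    Hypothetical helper using the textbook mu-law formula.
    """
    mu = quantization_channels - 1
    x = max(-1.0, min(1.0, x))  # clamp to the valid input range
    # Logarithmic compression: small amplitudes get finer resolution.
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift from [-1, 1] to [0, mu] and round to the nearest integer code.
    return int((compressed + 1) / 2 * mu + 0.5)


def mu_law_decode(code, quantization_channels=256):
    """Approximately invert mu_law_encode, returning a sample in [-1, 1]."""
    mu = quantization_channels - 1
    y = 2 * code / mu - 1  # back to [-1, 1]
    # Exponential expansion, the inverse of the logarithmic compression.
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


if __name__ == "__main__":
    for sample in (-1.0, -0.25, 0.0, 0.5, 1.0):
        code = mu_law_encode(sample)
        print(sample, code, round(mu_law_decode(code), 4))
```

The round trip is lossy by design: decoding returns a value close to the original sample, with the quantization error growing for larger amplitudes.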

Utilities

`mindspore.dataset.audio.BorderType`	Padding modes (border types).
`mindspore.dataset.audio.DensityFunction`	Density Functions.
`mindspore.dataset.audio.FadeShape`	Fade Shapes.
`mindspore.dataset.audio.GainType`	Gain Types.
`mindspore.dataset.audio.Interpolation`	Interpolation Type.
`mindspore.dataset.audio.MelType`	Mel Types.
`mindspore.dataset.audio.Modulation`	Modulation Type.
`mindspore.dataset.audio.NormMode`	Norm Types.
`mindspore.dataset.audio.NormType`	Norm Types.
`mindspore.dataset.audio.ResampleMethod`	Resample methods.
`mindspore.dataset.audio.ScaleType`	Scale Types.
`mindspore.dataset.audio.WindowType`	Window function types.
`mindspore.dataset.audio.create_dct`	Create a DCT transformation matrix with shape (n_mels, n_mfcc), normalized depending on norm.
`mindspore.dataset.audio.melscale_fbanks`	Create a frequency transformation matrix with shape (n_freqs, n_mels).
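
As an illustration of the kind of matrix `create_dct` describes, the sketch below builds a DCT-II matrix with shape (n_mels, n_mfcc) in pure Python. It is a sketch under stated assumptions, not the library's code: the function name `dct_matrix` and the boolean `ortho` flag are hypothetical, and the factor of 2 applied when no normalization is requested is one common convention that may differ from what MindSpore does.

```python
import math


def dct_matrix(n_mfcc, n_mels, ortho=False):
    """Return a DCT-II matrix as a list of n_mels rows with n_mfcc columns.

    Hypothetical helper for illustration only.
    """
    rows = []
    for n in range(n_mels):
        row = []
        for k in range(n_mfcc):
            value = math.cos(math.pi / n_mels * (n + 0.5) * k)
            if ortho:
                # Orthonormal scaling: column 0 gets sqrt(1/N), the rest sqrt(2/N).
                scale = math.sqrt(1.0 / n_mels) if k == 0 else math.sqrt(2.0 / n_mels)
                value *= scale
            else:
                # Assumed convention for the unnormalized case.
                value *= 2.0
            row.append(value)
        rows.append(row)
    return rows


if __name__ == "__main__":
    matrix = dct_matrix(n_mfcc=4, n_mels=8, ortho=True)
    print(len(matrix), len(matrix[0]))  # rows (n_mels), columns (n_mfcc)
```

With orthonormal scaling the columns of the matrix are mutually orthogonal and have unit length, which is why this normalization is often used when the matrix multiplies a mel spectrogram to produce MFCCs.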