mindspore.dataset.audio.DetectPitchFrequency
- class mindspore.dataset.audio.DetectPitchFrequency(sample_rate, frame_time=0.01, win_length=30, freq_low=85, freq_high=3400)
Detect pitch frequency.
It is implemented using the normalized cross-correlation function and median smoothing.
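The idea of combining normalized cross-correlation with median smoothing can be illustrated with a minimal pure-Python sketch. This is not MindSpore's implementation: the names `nccf_score`, `median_smooth` and `detect_pitch_sketch` are hypothetical, and the framing and windowing choices are simplifying assumptions made for readability.

```python
import math
from statistics import median


def nccf_score(first, second):
    """Normalized cross-correlation of two equal-length sample windows."""
    num = sum(a * b for a, b in zip(first, second))
    den = math.sqrt(sum(a * a for a in first) * sum(b * b for b in second))
    return num / den if den > 0 else 0.0


def median_smooth(values, win_length):
    """Replace each value with the median of a window centered on it."""
    half = win_length // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def detect_pitch_sketch(waveform, sample_rate, frame_time=0.01, win_length=30,
                        freq_low=85, freq_high=3400):
    """Hypothetical illustration: per-frame pitch from the best lag, then smoothing."""
    frame_len = max(1, round(sample_rate * frame_time))
    # The shortest lag corresponds to freq_high, the longest lag to freq_low.
    min_lag = max(1, sample_rate // freq_high)
    max_lag = max(min_lag, sample_rate // freq_low)
    raw = []
    # Assumption: only frames whose longest shifted copy fits in the waveform are used.
    last_start = len(waveform) - frame_len - max_lag
    for start in range(0, last_start + 1, frame_len):
        frame = waveform[start:start + frame_len]
        best_lag, best_score = min_lag, -math.inf
        for lag in range(min_lag, max_lag + 1):
            shifted = waveform[start + lag:start + lag + frame_len]
            score = nccf_score(frame, shifted)
            if score > best_score:
                best_lag, best_score = lag, score
        raw.append(sample_rate / best_lag)
    return median_smooth(raw, win_length)


# Usage: a 200 Hz sine sampled at 8000 Hz, searched between 120 Hz and 400 Hz.
rate = 8000
sine = [math.sin(2 * math.pi * 200 * n / rate) for n in range(1600)]
pitches = detect_pitch_sketch(sine, rate, win_length=3, freq_low=120, freq_high=400)
print(pitches[:3])
```

The frequency bounds translate into a lag search range, which is why a narrower `freq_low` to `freq_high` band reduces both the work per frame and the chance of picking a multiple of the true period.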
- Parameters
sample_rate (int) – Sampling rate of the waveform, e.g. 44100 (Hz), the value can't be zero.
frame_time (float, optional) – Duration of a frame, the value must be greater than zero. Default: 0.01.
win_length (int, optional) – The window length for median smoothing (in number of frames), the value must be greater than zero. Default: 30.
freq_low (int, optional) – Lowest frequency that can be detected (Hz), the value must be greater than zero. Default: 85.
freq_high (int, optional) – Highest frequency that can be detected (Hz), the value must be greater than zero. Default: 3400.
- Raises
TypeError – If sample_rate is not of type int.
ValueError – If sample_rate is 0.
TypeError – If frame_time is not of type float.
ValueError – If frame_time is not positive.
TypeError – If win_length is not of type int.
ValueError – If win_length is not positive.
TypeError – If freq_low is not of type int.
ValueError – If freq_low is not positive.
TypeError – If freq_high is not of type int.
ValueError – If freq_high is not positive.
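The checks listed above can be summarized in a small sketch. The helper name `check_detect_pitch_args` is hypothetical and exists only for illustration; the actual validation code and error messages in MindSpore may differ, and this sketch does not special-case `bool` values.

```python
def check_detect_pitch_args(sample_rate, frame_time=0.01, win_length=30,
                            freq_low=85, freq_high=3400):
    """Hypothetical helper mirroring the documented TypeError/ValueError rules."""
    if not isinstance(sample_rate, int):
        raise TypeError("sample_rate must be of type int")
    if sample_rate == 0:
        raise ValueError("sample_rate can't be zero")
    if not isinstance(frame_time, float):
        raise TypeError("frame_time must be of type float")
    if frame_time <= 0:
        raise ValueError("frame_time must be greater than zero")
    # win_length, freq_low and freq_high share the same two rules.
    for name, value in (("win_length", win_length),
                        ("freq_low", freq_low),
                        ("freq_high", freq_high)):
        if not isinstance(value, int):
            raise TypeError(name + " must be of type int")
        if value <= 0:
            raise ValueError(name + " must be greater than zero")


# Usage: the documented defaults pass, a float sample rate does not.
check_detect_pitch_args(44100)
try:
    check_detect_pitch_args(44100.0)
except TypeError as err:
    print("rejected:", err)
```

Note that, per the list above, `frame_time` must be a float, so an integer such as `1` would be rejected with a TypeError rather than converted.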
- Supported Platforms:
CPU
Examples
>>> import numpy as np
>>> import mindspore.dataset as ds
>>> import mindspore.dataset.audio as audio
>>>
>>> # Use the transform in dataset pipeline mode
>>> waveform = np.random.random([5, 16])  # 5 samples
>>> numpy_slices_dataset = ds.NumpySlicesDataset(data=waveform, column_names=["audio"])
>>> transforms = [audio.DetectPitchFrequency(30, 0.1, 3, 5, 25)]
>>> numpy_slices_dataset = numpy_slices_dataset.map(operations=transforms, input_columns=["audio"])
>>> for item in numpy_slices_dataset.create_dict_iterator(num_epochs=1, output_numpy=True):
...     print(item["audio"].shape, item["audio"].dtype)
...     break
(5,) float32
>>>
>>> # Use the transform in eager mode
>>> waveform = np.random.random([16])  # 1 sample
>>> output = audio.DetectPitchFrequency(30, 0.1, 3, 5, 25)(waveform)
>>> print(output.shape, output.dtype)
(5,) float32