Vol. 9, No. 1 (December, 2009)

Method for estimating pitch independently from power spectrum envelope for speech and musical signal
Yoshifumi Hara, Mitsuo Matsumoto and Kazunori Miyoshi

Pitch and changes in pitch are primary characteristics of a speech signal. Because a speech signal is quasi-periodic, a pitch estimation method must be both stable and accurate. Various methods have been proposed for extracting the periods of a speech signal in the time domain and for analyzing the microstructure of the spectrum in the frequency domain. The autocorrelation function (ACF) and its variants are well-known methods for detecting the periodicity of a speech signal in the time domain and are known to be robust to noise. The ACF in the time domain is equivalent to the power spectrum in the frequency domain, since the two form a Fourier transform pair. Therefore, pitch estimated by the ACF is affected by the power spectrum of the speech signal. This paper proposes a method that applies the ACF to detect the periodicity of the microstructure of the spectrum in the frequency domain, independently of the power spectrum envelope. First, the speech signal is divided into a set of frames. Second, in each frame, the major local peaks of the amplitude-frequency characteristics of the signal are picked in the frequency domain. Third, the amplitude-frequency characteristics are represented as a sequence of unit impulses, that is, a line spectrum, in which the impulses are located on the frequency axis at the positions of the local peaks. Finally, the ACF is applied to this sequence to extract the period of the impulses on the frequency axis, and the pitch is estimated from that period. Since pitch estimated by this method is free from the power spectrum envelope of the speech signal, the method is stable and accurate. Furthermore, because a simplified ACF is applicable to a line spectrum, the method is advantageous in terms of computational complexity.
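
The four steps described in the abstract can be illustrated with a minimal Python sketch for a single frame. This is not the authors' implementation: the function names, the relative peak threshold, the 50 to 500 Hz search range, and the naive DFT are all assumptions made for illustration. In particular, the "simplified ACF" is interpreted here as counting pairs of impulses at each spacing, which is what the autocorrelation of a unit-impulse sequence reduces to.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of a naive DFT for bins 0..N/2 (an FFT would be used in practice)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def pick_peaks(mag, rel_threshold=0.1):
    """Bin indices of local maxima above rel_threshold * max(mag).

    rel_threshold is an assumed parameter; the abstract only says "major local peaks".
    """
    floor = rel_threshold * max(mag)
    return [
        k for k in range(1, len(mag) - 1)
        if mag[k] > floor and mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]
    ]


def impulse_acf(peaks, max_lag):
    """ACF of a unit-impulse line spectrum: the number of peak pairs at each spacing.

    Peak heights are discarded, so the result does not depend on the envelope.
    """
    acf = [0] * (max_lag + 1)
    for i, a in enumerate(peaks):
        for b in peaks[i + 1:]:
            if b - a <= max_lag:
                acf[b - a] += 1
    return acf


def estimate_pitch(frame, sample_rate, min_hz=50.0, max_hz=500.0):
    """Estimate pitch in Hz from the dominant spacing of spectral peaks.

    min_hz and max_hz are an assumed search range, not taken from the paper.
    """
    bin_hz = sample_rate / len(frame)
    lo = max(1, int(min_hz / bin_hz))
    hi = int(max_hz / bin_hz)
    peaks = pick_peaks(magnitude_spectrum(frame))
    acf = impulse_acf(peaks, hi)
    # max() returns the first maximum, so ties resolve to the smallest spacing.
    best_lag = max(range(lo, hi + 1), key=lambda k: acf[k])
    return best_lag * bin_hz


def synth_frame(amps, f0=200.0, sample_rate=8000, n=400):
    """Hypothetical test signal: harmonics of f0 with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * f0 * (h + 1) * t / sample_rate)
            for h, a in enumerate(amps))
        for t in range(n)
    ]


if __name__ == "__main__":
    # Same pitch, two different envelopes; the second has no fundamental at all.
    print(estimate_pitch(synth_frame([0.3, 1.0, 0.5, 0.8, 0.4]), 8000))
    print(estimate_pitch(synth_frame([0.0, 1.0, 0.5, 0.8, 0.4]), 8000))
```

Both synthetic frames yield an estimate of about 200 Hz even though their harmonic amplitudes differ, because only the peak locations enter the autocorrelation. Real speech would additionally need windowing and a more careful peak-selection rule, which the abstract does not specify.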

Key words: Pitch estimation, Autocorrelation, Power spectrum, Line spectrum, Peak-picking
