The modified group delay function and its application to phoneme recognition
Date Issued
25-09-2003
Author(s)
Indian Institute of Technology, Madras
Gadde, Venkata
Abstract
We explore a new spectral representation of speech signals based on group delay functions. Group delay functions by themselves are noisy and difficult to interpret because zeros close to the unit circle in the z-domain clutter the spectra. We therefore use a modified group delay function that reduces the effects of these zeros. Assuming that this new function is minimum phase, the modified group delay spectrum is converted to a sequence of cepstral coefficients. A preliminary phoneme recogniser is built using features derived from these cepstra, and the results are compared with those obtained from features derived from traditional mel frequency cepstral coefficients (MFCC). The baseline MFCC performance is 34.7%, while that of the best modified group delay cepstrum is 39.2%. The performance of the composite MFCC feature, which includes the derivatives and double derivatives, is 60.7%, while that of the composite modified group delay feature is 57.3%. When these two composite features are combined, performance improves by ≈ 2% (to 62.8%). Combining this new system with linear frequency cepstra (LFC) [2] yields a further ≈ 0.8% improvement (to 63.6%).
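The processing chain outlined in the abstract (group delay, a modified group delay that suppresses the effect of zeros near the unit circle, then cepstra) can be illustrated with a minimal Python sketch. It is not taken from the paper. The plain group delay uses the standard identity tau(w) = (X_R Y_R + X_I Y_I) / |X|^2, where Y is the transform of n·x[n]. Everything else is an assumption made for illustration: replacing the denominator with a cepstrally smoothed spectrum, the parameters `alpha`, `gamma`, and `n_lifter` with their default values, the DCT step, and all function names.

```python
import numpy as np

def group_delay(x, n_fft=512):
    """Group delay of one frame: tau = Re(Y / X), with Y = FFT(n * x[n])."""
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(n * x, n_fft)
    num = X.real * Y.real + X.imag * Y.imag
    # Small floor avoids division by zero at exact spectral nulls.
    return num / np.maximum(np.abs(X) ** 2, 1e-12)

def smoothed_magnitude(mag, n_lifter):
    """Cepstrally smoothed magnitude spectrum (assumed smoothing method)."""
    log_mag = np.log(np.maximum(mag, 1e-12))
    c = np.fft.irfft(log_mag)              # real cepstrum
    c[n_lifter:len(c) - n_lifter + 1] = 0  # keep only low quefrencies
    return np.exp(np.fft.rfft(c).real)

def modified_group_delay(x, n_fft=512, n_lifter=8, alpha=0.4, gamma=0.9):
    """Hypothetical modified group delay: smoothed denominator plus
    magnitude compression. alpha and gamma are illustrative values."""
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(n * x, n_fft)
    S = smoothed_magnitude(np.abs(X), n_lifter)
    tau = (X.real * Y.real + X.imag * Y.imag) / S ** (2 * gamma)
    return np.sign(tau) * np.abs(tau) ** alpha

def dct_cepstra(spectrum, n_ceps=13):
    """First n_ceps DCT-II coefficients of a spectrum (assumed conversion)."""
    M = len(spectrum)
    m = np.arange(M)
    k = np.arange(n_ceps)[:, None]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * M))
    return basis @ spectrum

# Usage: a unit impulse delayed by 3 samples has a constant group delay of 3.
frame = np.zeros(16)
frame[3] = 1.0
tau = group_delay(frame, n_fft=64)
tau_mod = modified_group_delay(frame, n_fft=64, n_lifter=4, alpha=1.0, gamma=1.0)
ceps = dct_cepstra(tau_mod)
```

With `alpha = gamma = 1` and a flat magnitude spectrum, the modified function in this sketch reduces to the plain group delay, which gives a quick sanity check before trying it on speech frames.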
Volume
1