Continuous speech recognition using joint features derived from the modified group delay function and MFCC

Date Issued
01-01-2004
Author(s)
Hegde, Rajesh M.
Murthy, Hema A. (Indian Institute of Technology, Madras)
Rao, Gadde V. Ramana
Abstract
Feature extraction and selection for continuous speech recognition is a complex task. State-of-the-art speech recognition systems use features that are derived by ignoring the Fourier transform phase. In our earlier studies we have shown the efficacy of the modified group delay feature (MODGDF), derived from the Fourier transform phase, for phoneme, syllable and speaker recognition. In this paper we use the MODGDF and the popular MFCC, derived from the Fourier transform magnitude, to compute joint features for continuous speech recognition of two Indian languages, Tamil and Telugu. A novel method of segmenting the continuous speech signal into syllable-like units, followed by isolated-style recognition using HMMs, is used. We further use an innovative technique that transforms the problem of detecting the correct string of syllabic units with maximum likelihood into that of finding an optimal state sequence locally. The recognition system does not use any language models. The MODGDF gave promising recognition performance for the two languages and compared well with the MFCC. Joint features derived using the MODGDF and MFCC gave a 10.6% improvement for both Tamil and Telugu. The improvement reinforces the hypothesis that the MODGDF captures information complementary to that of the MFCC and can be used along with the MFCC to capture the complete information in the speech signal at the functional level, helping to avoid heavy auditory and language models.
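
To illustrate what "derived from the Fourier transform phase" can mean in practice, here is a minimal sketch of the standard (unmodified) group delay function, computed directly from the signal without phase unwrapping. It relies on the textbook identity tau(k) = (X_R Y_R + X_I Y_I) / |X|^2, where X is the DFT of x[n] and Y is the DFT of n*x[n]. The function names and the `eps` guard are assumptions made for this illustration; the modification that defines the MODGDF is not described in the abstract and is not reproduced here.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for short frames)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def group_delay(x, eps=1e-12):
    """Group delay in samples for each DFT bin of the frame x.

    Uses tau(k) = (X_R*Y_R + X_I*Y_I) / |X|^2 with X = DFT{x[n]} and
    Y = DFT{n * x[n]}. eps is a hypothetical guard against division by
    zero at spectral nulls; it is not taken from the paper.
    """
    X = dft(x)
    Y = dft([n * v for n, v in enumerate(x)])
    return [
        (a.real * b.real + a.imag * b.imag) / max(abs(a) ** 2, eps)
        for a, b in zip(X, Y)
    ]


# A unit impulse delayed by 3 samples has a constant group delay of 3.
frame = [0, 0, 0, 1, 0, 0, 0, 0]
tau = group_delay(frame)
```

The delayed impulse is a convenient sanity check: its spectrum has unit magnitude and linear phase, so every bin should report the same delay. On real speech the denominator |X|^2 comes close to zero near spectral nulls, which makes the raw function very spiky.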