Options
Non-negative subspace projection during conventional MFCC feature extraction for noise robust speech recognition
Date Issued
01-01-2013
Author(s)
Abstract
An additional feature processing algorithm using Non-negative Matrix Factorization (NMF) is proposed to be included during the conventional extraction of Mel-frequency cepstral coefficients (MFCC) for achieving noise robustness in HMM based speech recognition. The proposed approach reconstructs log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned using the standard NMF of training data. A variation of learning the bases is proposed, which uses histogram equalized activation coefficients during training, to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques like Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA). © 2013 IEEE.