Kernel based clustering and vector quantization for speech recognition
Date Issued
01-12-2004
Author(s)
Satish, D. Srikrishna
Indian Institute of Technology, Madras
Abstract
In this paper we address the issues in the construction of discrete hidden Markov models (HMMs) in the feature space of Mercer kernels. Kernel-space HMMs are suitable for complex pattern recognition tasks that involve varying-length patterns, as in speech recognition. The main issues addressed relate to clustering and vector quantization in the kernel feature space for large data sets containing data from multiple classes. Convergence of the kernel-based clustering method [1] is slow when the data set is large. This limitation is overcome by clustering the data of each class separately. Computing the measure of similarity between a data vector and the mean vector of a cluster in the feature space of a Mercer kernel with an implicit mapping involves evaluating the kernel function on the data vector and every member of the cluster. We propose a method to reduce the computational complexity of vector quantization in the kernel feature space. The proposed methods for clustering and vector quantization are used to build discrete HMMs in the kernel feature space for recognition of spoken utterances of letters in the E-set of the English alphabet. © 2004 IEEE.
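The cost described in the abstract can be illustrated with the standard kernel-trick identity for the squared distance between a mapped vector and a cluster mean in feature space. The sketch below is not the paper's proposed method for reducing complexity; it only shows the baseline computation, in which each comparison needs one kernel evaluation per cluster member. The Gaussian kernel, its `gamma` value, and the function names `rbf_kernel`, `sq_dist_to_cluster_mean`, and `quantize` are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian kernel matrix between rows of a and rows of b (assumed kernel)."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def sq_dist_to_cluster_mean(x, cluster, kernel=rbf_kernel):
    """Squared feature-space distance from phi(x) to the mean of phi(cluster).

    ||phi(x) - m||^2 = k(x, x) - (2/n) * sum_i k(x, x_i)
                       + (1/n^2) * sum_{i,j} k(x_i, x_j)
    The mapping phi is never computed explicitly.
    """
    k_xx = kernel(x, x)[0, 0]
    k_xc = kernel(x, cluster)[0]      # one kernel evaluation per cluster member
    k_cc = kernel(cluster, cluster)   # does not depend on x, so it could be cached
    return k_xx - 2.0 * k_xc.mean() + k_cc.mean()

def quantize(x, clusters, kernel=rbf_kernel):
    """Return the index of the cluster whose feature-space mean is nearest to x."""
    distances = [sq_dist_to_cluster_mean(x, c, kernel) for c in clusters]
    return int(np.argmin(distances))

# Hypothetical toy data: two small clusters in 2-D.
clusters = [np.array([[0.0, 0.0], [0.0, 1.0]]),
            np.array([[5.0, 5.0], [6.0, 5.0]])]
codeword = quantize(np.array([5.5, 5.0]), clusters)
```

In this baseline, quantizing one vector against clusters holding N points in total takes N kernel evaluations, which is the cost that motivates a reduced-complexity scheme for large data sets.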