Publication: Modeling syllable duration in Indian languages using support vector machines

Date
01-12-2005
Abstract
In this paper we propose Support Vector Machines (SVMs) for predicting the durations of syllables in Indian languages. SVM regression models are used for modeling the durations of the syllables, and SVM classification models are used for categorizing the syllables based on duration. The analysis is performed on broadcast news data in Hindi, Telugu and Tamil. The input to the SVM consists of a set of phonological, positional and contextual features extracted from the text. We also propose two-stage duration models for improving the prediction accuracy. From the studies it was found that about 86% of the syllable durations are predicted within 25% of the actual duration. The performance of the duration models is evaluated using objective measures such as mean absolute error (μ), standard deviation (σ) and correlation coefficient (γ). © 2005 IEEE.
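The objective measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `duration_metrics`, the `tolerance` parameter, and the sample durations are hypothetical, and the sketch assumes that σ is the standard deviation of the absolute errors and that γ is the Pearson correlation between actual and predicted durations; the abstract does not spell out either definition.

```python
import math


def duration_metrics(actual, predicted, tolerance=0.25):
    """Compute illustrative objective measures for predicted syllable durations.

    Assumptions (not stated in the abstract): sigma is the population
    standard deviation of the absolute errors, and gamma is the Pearson
    correlation between actual and predicted durations.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    n = len(actual)
    errors = [abs(a - p) for a, p in zip(actual, predicted)]

    # Mean absolute error (mu) and standard deviation of the errors (sigma).
    mu = sum(errors) / n
    sigma = math.sqrt(sum((e - mu) ** 2 for e in errors) / n)

    # Pearson correlation coefficient (gamma) between actual and predicted.
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    denom = math.sqrt(var_a * var_p)
    gamma = cov / denom if denom > 0 else float("nan")

    # Fraction of syllables whose error is within `tolerance` of the actual duration.
    within = sum(1 for a, e in zip(actual, errors) if e <= tolerance * a) / n

    return {"mu": mu, "sigma": sigma, "gamma": gamma, "within": within}


# Hypothetical durations in milliseconds, for illustration only.
actual_ms = [100, 200, 300, 400]
predicted_ms = [110, 180, 390, 400]
metrics = duration_metrics(actual_ms, predicted_ms)
```

With these made-up values, three of the four predictions fall within 25% of the actual duration, so `metrics["within"]` is 0.75; the figure of about 86% reported in the abstract is the same kind of fraction computed over the paper's evaluation data.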