DNN acoustic models for dysarthric speech
Date Issued
19-10-2017
Author(s)
Tejaswi, Seeram
Indian Institute of Technology, Madras
Abstract
In this paper, we investigate various training methods for building deep neural network (DNN) based acoustic models for dysarthric speech data. Methods such as multitask learning, knowledge distillation, and model adaptation, which address data sparsity and model over-fitting, are employed in order to study the merits of each. In the knowledge distillation framework, privileged information that is available only during training, in addition to the feature-label pairs, is exploited to help the model learn better, without requiring that information during testing [1]; knowledge from one model can also be distilled into another, thereby guiding its learning [2]. In this work, a DNN acoustic model trained on data pooled from dysarthric speech and parallel unimpaired speech serves as the intelligent teacher, while the student DNN model is trained using only dysarthric speech. The target label for training the student model is a combination of the hard-aligned labels and the labels obtained from a forward pass through the teacher model. In addition to this technique, other knowledge-sharing techniques such as multitask learning were explored for dysarthric speech data and were found to give a relative improvement of 11% over the corresponding baseline models.
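The combination of hard-aligned labels and teacher outputs described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two are combined, so the linear interpolation used here, the `weight` and `temperature` parameters, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax with an optional temperature (assumed, not from the paper)."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def combined_targets(hard_labels, teacher_logits, weight=0.5, temperature=1.0):
    """Interpolate one-hot hard labels with teacher posteriors, frame by frame.

    hard_labels:    (frames,) integer state indices from a forced alignment
    teacher_logits: (frames, classes) teacher outputs before the softmax
    weight:         hypothetical interpolation weight placed on the hard labels
    """
    teacher_post = softmax(teacher_logits, temperature)
    num_classes = teacher_post.shape[-1]
    one_hot = np.eye(num_classes)[np.asarray(hard_labels)]
    return weight * one_hot + (1.0 - weight) * teacher_post


def cross_entropy(targets, student_logits):
    """Mean cross-entropy between the soft targets and the student posteriors."""
    log_post = np.log(softmax(student_logits))
    return float(-(targets * log_post).sum(axis=-1).mean())


if __name__ == "__main__":
    # Toy values: 2 frames, 3 output classes.
    hard = [0, 2]
    teacher = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    student = [[1.0, 0.5, 0.0], [0.0, 0.2, 0.9]]
    targets = combined_targets(hard, teacher, weight=0.5)
    print(targets)
    print(cross_entropy(targets, student))
```

With `weight=1.0` the targets reduce to the hard labels alone, and with `weight=0.0` the student is trained purely on the teacher posteriors; intermediate values mix the two.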