Implementing fusion techniques for the classification of paralinguistic information

Vlasenko, Bogdan; Sebastian, Jilt; Pavan Kumar, D. S.; Magimai-Doss, Mathew

doi:10.21437/Interspeech.2018-2360

Date Issued

01-01-2018

Author(s)

Vlasenko, Bogdan

Sebastian, Jilt

Pavan Kumar, D. S.

Magimai-Doss, Mathew

DOI

10.21437/Interspeech.2018-2360

Abstract

This work tests several classification techniques and acoustic feature sets, and combines them using late fusion to classify paralinguistic information for the ComParE 2018 challenge. We use Multiple Linear Regression (MLR) with Ordinary Least Squares (OLS) analysis to select the most informative features for the Self-Assessed Affect (SAA) Sub-Challenge. We also propose using raw-waveform convolutional neural networks (CNNs) in three of the paralinguistic sub-challenges. By estimating the codebook on the combined evaluation split, we obtain a better representation for the Bag-of-Audio-Words approach. We preprocess the speech into vocalized segments to improve classification performance. To fuse our leading classification techniques, we apply weighted late fusion to their confidence scores. To estimate the optimal fusion weight, we run two mismatched evaluation phases in which the training and development sets are exchanged. Weighted late fusion performs better on the development sets than the baseline techniques. Raw-waveform techniques perform comparably to the baseline.
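
As an illustration of the weighted late fusion step described in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes two classifiers that each output a matrix of per-class confidence scores, a single scalar weight `w` searched over a coarse grid, and unweighted average recall (UAR) as the selection criterion. The function names `fuse`, `uar`, and `best_weight`, the grid resolution, and the two-system setup are all hypothetical choices made for this example.

```python
import numpy as np


def fuse(scores_a, scores_b, w):
    """Weighted late fusion of two (n_samples, n_classes) confidence matrices."""
    return w * scores_a + (1.0 - w) * scores_b


def uar(y_true, y_pred, n_classes):
    """Unweighted average recall: the mean of the per-class recalls."""
    recalls = []
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():  # skip classes absent from this evaluation set
            recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))


def best_weight(phases, n_classes, grid=None):
    """Return the grid weight with the highest mean UAR over all phases.

    Each phase is a tuple (scores_a, scores_b, y_true) from one evaluation
    phase, e.g. models trained on train and scored on dev, and the reverse.
    Ties are resolved in favour of the smallest weight.
    """
    if grid is None:
        grid = np.linspace(0.0, 1.0, 11)  # assumed step of 0.1
    mean_uar = [
        np.mean([
            uar(y, fuse(a, b, w).argmax(axis=1), n_classes)
            for a, b, y in phases
        ])
        for w in grid
    ]
    return float(grid[int(np.argmax(mean_uar))])
```

Under these assumptions, the two mismatched phases would be passed as a two-element list, one tuple for models trained on the training set and scored on the development set, and one for the exchanged roles. Averaging the criterion over both phases keeps the chosen weight from being tuned to a single split.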

Volume

2018-September
