Robust speech recognition and feature extraction using HMM2

Authors

Katrin Weber, Shajith Ikbal, Samy Bengio, Hervé Bourlard

Publication date

2003/4/1

Journal

Computer Speech & Language

Volume

Issue

2-3

Pages

195-211

Publisher

Academic Press

Description

This paper presents the theoretical basis and preliminary experimental results of a new HMM model, referred to as HMM2, which can be considered as a mixture of HMMs. In this new model, the emission probabilities of the temporal (primary) HMM are estimated through secondary, state specific, HMMs working in the acoustic feature space. Thus, while the primary HMM is performing the usual time warping and integration, the secondary HMMs are responsible for extracting/modeling the possible feature dependencies, while performing frequency warping and integration. Such a model has several potential advantages, such as a more flexible modeling of the time/frequency structure of the speech signal. When working with spectral features, such a system can also perform nonlinear spectral warping, effectively implementing a form of nonlinear vocal tract normalization. Furthermore, it will be shown that HMM2 can …
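The two-level structure described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scalar Gaussian emissions in each secondary HMM, treats each feature vector as a sequence of frequency components, and scores with the plain forward algorithm in the log domain. Training and decoding are omitted, and all function names and the dictionary layout are hypothetical.

```python
import numpy as np

def log_forward(log_init, log_trans, log_emit):
    """Sequence log-likelihood via the forward algorithm.
    log_trans[i, j] is the log probability of moving from state i to state j;
    log_emit[t, j] is the log emission score of state j at step t."""
    alpha = log_init + log_emit[0]
    for t in range(1, log_emit.shape[0]):
        # Log-sum-exp over previous states, then add the current emission score.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_trans, axis=0) + log_emit[t]
    return np.logaddexp.reduce(alpha)

def gaussian_logpdf(x, mean, var):
    """Log density of a scalar Gaussian (an assumed emission model)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def secondary_loglik(frame, hmm):
    """Score one feature vector with a secondary HMM that runs along the
    frequency axis: each coefficient is one 'step' of the sequence."""
    log_emit = gaussian_logpdf(frame[:, None], hmm["mean"][None, :], hmm["var"][None, :])
    return log_forward(hmm["log_init"], hmm["log_trans"], log_emit)

def hmm2_loglik(frames, primary, secondaries):
    """Score a (T, F) utterance: the emission score of primary state q at
    time t is the log-likelihood of frame t under secondary HMM q."""
    log_emit = np.array([[secondary_loglik(f, s) for s in secondaries] for f in frames])
    return log_forward(primary["log_init"], primary["log_trans"], log_emit)

# Toy example with made-up parameters: 2 primary states, each owning a
# 2-state secondary HMM over 4 frequency coefficients.
rng = np.random.default_rng(0)
trans = np.log(np.array([[0.9, 0.1], [0.2, 0.8]]))
init = np.log(np.array([0.5, 0.5]))
primary = {"log_init": init, "log_trans": trans}
secondaries = [
    {"log_init": init, "log_trans": trans,
     "mean": np.array([0.0, 1.0]), "var": np.array([1.0, 1.0])},
    {"log_init": init, "log_trans": trans,
     "mean": np.array([2.0, -1.0]), "var": np.array([0.5, 2.0])},
]
frames = rng.normal(size=(5, 4))
score = hmm2_loglik(frames, primary, secondaries)
```

In this sketch the primary forward pass handles alignment in time, while each secondary forward pass sums over all alignments of secondary states to frequency components, which is where the frequency warping mentioned in the description would come from.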

Total citations

Cited by 41

[Bar chart: citations per year, 2003-2019. Yearly counts shown, in order: 2, 6, 7, 2, 4, 6, 3, 2, 3, 1, 1, 1, 1, 1, 1]
