Authors
Katrin Weber, Shajith Ikbal, Samy Bengio, Hervé Bourlard
Publication date
2003/4/1
Journal
Computer Speech & Language
Volume
17
Issue
2-3
Pages
195-211
Publisher
Academic Press
Description
This paper presents the theoretical basis and preliminary experimental results of a new hidden Markov model (HMM), referred to as HMM2, which can be considered a mixture of HMMs. In this model, the emission probabilities of the temporal (primary) HMM are estimated through secondary, state-specific HMMs working in the acoustic feature space. Thus, while the primary HMM performs the usual time warping and integration, the secondary HMMs extract and model possible feature dependencies and perform frequency warping and integration. Such a model has several potential advantages, such as more flexible modeling of the time/frequency structure of the speech signal. When working with spectral features, such a system can also perform nonlinear spectral warping, effectively implementing a form of nonlinear vocal tract normalization. Furthermore, it will be shown that HMM2 can …
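The two-level structure described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each frame is a sequence of scalar frequency-band values, that secondary-HMM states emit scalar Gaussians, and that both levels are scored with the standard forward algorithm in the log domain. All names (`forward_log`, `frame_log_emission`, `hmm2_log_likelihood`, the dictionary keys) are hypothetical.

```python
import numpy as np


def safe_log(p):
    """Elementwise log that maps zero probabilities to -inf without warnings."""
    with np.errstate(divide="ignore"):
        return np.log(np.asarray(p, dtype=float))


def log_gauss(x, mean, var):
    """Log density of a scalar Gaussian, broadcast over arrays."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def forward_log(log_init, log_trans, log_obs):
    """Standard HMM forward pass in the log domain.

    log_obs has shape (N, S): N sequence positions, S states.
    Returns the log-likelihood of the whole sequence.
    """
    alpha = log_init + log_obs[0]
    for obs in log_obs[1:]:
        # Sum over predecessor states i of alpha[i] * trans[i, j].
        alpha = np.logaddexp.reduce(alpha[:, None] + log_trans, axis=0) + obs
    return np.logaddexp.reduce(alpha)


def frame_log_emission(frame, sec):
    """Emission log-likelihood of one frame under one secondary HMM.

    The secondary HMM runs along the frequency axis of the frame
    (assumption: one scalar Gaussian per secondary state).
    """
    log_obs = log_gauss(frame[:, None], sec["mean"][None, :], sec["var"][None, :])
    return forward_log(safe_log(sec["init"]), safe_log(sec["trans"]), log_obs)


def hmm2_log_likelihood(frames, init, trans, secondaries):
    """Log-likelihood of a (T, F) frame sequence under the primary HMM.

    secondaries[q] is the secondary HMM attached to primary state q.
    """
    log_obs = np.array(
        [[frame_log_emission(f, sec) for sec in secondaries] for f in frames]
    )
    return forward_log(safe_log(init), safe_log(trans), log_obs)


# Toy usage: 2 primary (temporal) states, each with a 2-state
# left-to-right secondary HMM over 4 frequency bands. Values are made up.
rng = np.random.default_rng(0)
demo_frames = rng.normal(size=(5, 4))
demo_secondaries = [
    {
        "init": np.array([1.0, 0.0]),
        "trans": np.array([[0.6, 0.4], [0.0, 1.0]]),
        "mean": np.array([m, -m]),
        "var": np.array([1.0, 1.0]),
    }
    for m in (0.5, -0.5)
]
demo_ll = hmm2_log_likelihood(
    demo_frames,
    init=np.array([1.0, 0.0]),
    trans=np.array([[0.7, 0.3], [0.0, 1.0]]),
    secondaries=demo_secondaries,
)
```

In this sketch the frequency warping mentioned above corresponds to the secondary state sequence: the point along the frequency axis where the secondary HMM switches states can differ from frame to frame.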
Total citations
[Citations-per-year chart, 2003 to 2019; per-year counts not recoverable from the extracted text]
Scholar articles
K Weber, S Ikbal, S Bengio, H Bourlard - Computer Speech & Language, 2003