Multi-source far-distance microphone selection and combination for automatic transcription of lectures

Authors

Matthias Wölfel, Christian Fügen, Shajith Ikbal, John W McDonough

Publication date

2006

Conference

Ninth International Conference on Spoken Language Processing

Description

In this work, we present our progress in multi-source far-field automatic speech-to-text transcription of lecture speech. In particular, we show how the best of several far-field channels can be selected based on a signal-to-noise ratio criterion, and how the signals from multiple channels can be combined either at the waveform level, using blind channel combination, or at the hypothesis level, using confusion network techniques, to improve the accuracy of a far-field lecture transcription system. Using the techniques described here, we ran a series of experiments on the test set used by the US National Institute of Standards and Technology for the RT-05S evaluation. For the multiple distant microphones (MDM) task of RT-05S, our system achieved a word error rate of 38.5%, which represents an improvement of over 13% absolute compared to the best reported results in the RT-05S evaluation.
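
As an illustration of the channel selection idea mentioned in the description, the sketch below picks the channel with the highest estimated signal-to-noise ratio. It is not the authors' method: the description does not say how the SNR is estimated, so the energy-based estimator here (quietest frames taken as noise, loudest frames as speech) is an assumption, as are the function names `frame_energies`, `estimate_snr_db`, and `select_best_channel` and the parameters `frame_len` and `noise_fraction`.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def estimate_snr_db(samples, frame_len=400, noise_fraction=0.1):
    """Crude SNR estimate in dB (assumed estimator, not taken from the paper).

    The quietest `noise_fraction` of frames is treated as noise and the
    loudest `noise_fraction` as speech plus noise.
    """
    energies = sorted(frame_energies(samples, frame_len))
    if len(energies) < 2:
        raise ValueError("signal too short for the chosen frame length")
    k = max(1, int(len(energies) * noise_fraction))
    noise_power = sum(energies[:k]) / k + 1e-12    # small floor avoids log(0)
    speech_power = sum(energies[-k:]) / k + 1e-12
    return 10.0 * math.log10(speech_power / noise_power)


def select_best_channel(channels, frame_len=400):
    """Return (index of the channel with the highest estimated SNR, all SNRs)."""
    snrs = [estimate_snr_db(ch, frame_len) for ch in channels]
    best = max(range(len(snrs)), key=snrs.__getitem__)
    return best, snrs
```

Calling `select_best_channel` with a list of equal-rate sample sequences, one per distant microphone, returns the index of the channel that would be passed on to the recognizer, along with the per-channel estimates. A real system would more likely estimate noise from speech/non-speech segmentation than from sorted frame energies.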

Total citations

Cited by 45

Citations per year, 2006 to 2019 (chart not reproduced)
