Unsupervised speech/non-speech detection for automatic speech recognition in meeting rooms

Authors

Hari Krishna Maganti, Petr Motlicek, Daniel Gatica-Perez

Publication date

2007/4/15

Conference

2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP'07

Pages

IV-1037-IV-1040

Publisher

IEEE

Description

The goal of this work is to provide robust and accurate speech detection for automatic speech recognition (ASR) in meeting room settings. The solution computes the long-term modulation spectrum of a given audio signal and examines a specific frequency range for dominant speech components in order to classify the signal as speech or non-speech. For comparison, manual segmentation, short-term energy segmentation, combined short-term energy and zero-crossing segmentation, and a recently proposed multilayer perceptron (MLP) classifier system are also tested. The segmentation methods are evaluated by speech recognition performance on a standard database, under conditions where the signal-to-noise ratio (SNR) varies considerably: close-talking headset, lapel, distant microphone array output, and distant microphone. The results reveal that the proposed method is …
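
For readers unfamiliar with the short-term energy and zero-crossing baselines mentioned in the description, the following is a minimal sketch of how such a frame-level detector might look. It is not the paper's implementation: the function names, the 25 ms frame and 10 ms hop (at an assumed 16 kHz sampling rate), both thresholds, and the decision rule itself are illustrative assumptions.

```python
import math


def frame_features(samples, frame_len=400, hop=160):
    """Return a list of (energy, zero_crossing_rate) tuples, one per frame.

    frame_len and hop are in samples; the defaults assume 16 kHz audio
    (25 ms frames, 10 ms hop). Both values are assumptions, not from the paper.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-term energy: mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def detect_speech(samples, energy_thresh=1e-3, zcr_thresh=0.25,
                  frame_len=400, hop=160):
    """Label each frame True (speech-like) or False.

    Hypothetical rule: a frame counts as speech-like when its energy is above
    energy_thresh and its zero-crossing rate is below zcr_thresh. This only
    captures voiced, reasonably loud frames; the thresholds are placeholders.
    """
    return [
        energy > energy_thresh and zcr < zcr_thresh
        for energy, zcr in frame_features(samples, frame_len, hop)
    ]
```

Fixed thresholds of this kind are sensitive to recording level and background noise, so they would need retuning for each of the microphone conditions listed in the description.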

Total citations

Cited by 43

[Bar chart: citations per year, 2007-2023]
