Authors
Rajitha Navarathna, David Dean, Sridha Sridharan, Patrick Lucey
Publication date
2013/6/1
Journal
Computer Speech & Language
Volume
27
Issue
4
Pages
911-927
Publisher
Academic Press
Description
Audio-visual speech recognition, or the combination of visual lip-reading with traditional acoustic speech recognition, has previously been shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that of an automotive cabin. The research presented in this paper extends the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) is conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contributions of the side-oriented and centrally oriented cameras in improving visual speech recognition accuracy. Finally, combination of …
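The excerpt does not state the paper's exact formulation, but as a minimal sketch, a synchronous multi-stream HMM ties all streams to a single shared state sequence and combines the per-stream emission likelihoods as a weighted product; the stream weights \lambda_s below are assumed notation, not taken from the paper:

% Synchronous multi-stream HMM emission likelihood (sketch).
% State j scores stream s with its own distribution b_{js},
% raised to a non-negative stream weight \lambda_s.
b_j(\mathbf{o}_t) = \prod_{s=1}^{S} \big[ b_{js}(\mathbf{o}_{st}) \big]^{\lambda_s},
\qquad \lambda_s \ge 0, \quad \sum_{s=1}^{S} \lambda_s = 1

For the four-stream visual SHMM described above, S = 4, with one visual feature stream per camera view; the relative weights determine how much each camera contributes to the combined score.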
Total citations
[Per-year citation chart, 2013–2023; individual counts not recoverable from extraction]