Authors
Rajitha Navarathna, David Dean, Sridha Sridharan, Patrick Lucey
Publication date
2013/6/1
Journal
Computer Speech & Language
Volume
27
Issue
4
Pages
911-927
Publisher
Academic Press
Description
Audio-visual speech recognition, or the combination of visual lip-reading with traditional acoustic speech recognition, has previously been shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that of an automotive cabin. The research presented in this paper extends the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) is conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contributions of the side-oriented and centrally oriented cameras in improving visual speech recognition accuracy. Finally, combination of …
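The excerpt does not state the paper's exact formulation, but as a minimal sketch, a synchronous multi-stream HMM ties all streams to a single shared state sequence and combines the per-stream emission likelihoods as a weighted product; the stream weights \lambda_s below are assumed notation, not taken from the paper:

% Synchronous multi-stream HMM emission likelihood (sketch).
% State j scores stream s with its own distribution b_{js},
% raised to a non-negative stream weight \lambda_s.
b_j(\mathbf{o}_t) = \prod_{s=1}^{S} \big[ b_{js}(\mathbf{o}_{st}) \big]^{\lambda_s},
\qquad \lambda_s \ge 0, \quad \sum_{s=1}^{S} \lambda_s = 1

For the four-stream visual SHMM described above, S = 4, with one visual feature stream per camera view; the relative weights determine how much each camera contributes to the combined score.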
Total citations
[Per-year citation chart, 2013–2023; individual counts not recoverable from extraction]