Spoofing detection via simultaneous verification of audio-visual synchronicity and transcription

Authors

Lea Schönherr, Steffen Zeiler, Dorothea Kolossa

Publication date

2017/12/16

Conference

IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)

Publisher

IEEE

Description

Acoustic speaker recognition systems are highly vulnerable to spoofing attacks via replayed or synthesized utterances. One possible countermeasure is audio-visual speaker recognition. However, adding the visual stream alone does not prevent spoofing attacks completely; it only provides further information for assessing the authenticity of the utterance. Many systems consider the audio and video modalities independently and can easily be spoofed by imitating only a single modality or by a bimodal replay attack with a victim's photograph or video. Therefore, we propose the simultaneous verification of the data synchronicity and the transcription in a challenge-response setup. We use coupled hidden Markov models (CHMMs) for text-dependent spoofing detection and introduce new features that provide information about the transcriptions of the utterance and the synchronicity of both streams. We evaluate …

Total citations

Cited by 1
