Authors
Miranti Indar Mandasari, Rahim Saeidi, David A van Leeuwen
Publication date
2012
Publisher
Santander, Spain: sn
Description
Speech production is influenced not only by intrinsic factors such as semantics, dialect, human perspective and emotion, but also by extrinsic factors such as environmental conditions and the transmission channel. In certain acoustic conditions, speakers tend to raise their vocal effort to overcome environmental hindrances such as the presence of noise or a long distance between speaker and listener. Only a few studies have addressed speaker recognition under non-neutral speech production conditions (i.e., high or low vocal effort and speech under stress) (Hansen, 2011). However, in real forensic cases, the incriminating recording may have been made with high vocal effort, which then has to be dealt with in speaker comparison.
This paper presents a study of the effect of high vocal effort speech on the performance of an automatic speaker recognition system, with attention to likelihood ratio (LR) calibration. Using the most recent algorithms in the field (Burget, 2011; Dehak, 2011), the calibration performance of the system is evaluated on both the high and normal vocal effort conditions of the latest NIST speaker recognition evaluation (SRE) (http://www.nist.gov/itl/iad/mig/sre.cfm).