Authors
Pascal Hecker, Arpita M Kappattanavar, Maximilian Schmitt, Sidratul Moontaha, Johannes Wagner, Florian Eyben, Björn W Schuller, Bert Arnrich
Publication date
2022/12/12
Conference
2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)
Pages
337-344
Publisher
IEEE
Description
Cognitive load is frequently induced in laboratory setups to measure responses to stress, and its impact on voice has been studied in the field of computational paralinguistics. One dataset on this topic was provided in the Computational Paralinguistics Challenge (ComParE) 2014, and therefore offers great comparability. Recently, transformer-based deep learning architectures have established a new state of the art and are gradually finding their way into the audio domain. In this context, we investigate the performance of popular transformer architectures in the audio domain on the ComParE 2014 dataset, and the impact of different pre-training and fine-tuning setups on these models. Further, we recorded a small custom dataset, designed to be comparable with the ComParE 2014 one, to assess cross-corpus model generalisability. We find that the transformer models outperform the challenge baseline, the challenge …
Total citations
11
Scholar articles
P Hecker, AM Kappattanavar, M Schmitt, S Moontaha… - 2022 21st IEEE International Conference on Machine …, 2022