Authors
Pavol Harar, Jesus B Alonso-Hernandez, Jiri Mekyska, Zoltan Galaz, Radim Burget, Zdenek Smekal
Publication date
2017/7/10
Conference
2017 International Conference and Workshop on Bioinspired Intelligence (IWOBI)
Pages
1-4
Publisher
IEEE
Description
This paper describes a preliminary investigation of Voice Pathology Detection using Deep Neural Networks (DNN). We used voice recordings of the sustained vowel /a/ produced at normal pitch from the German corpus Saarbruecken Voice Database (SVD). This corpus contains voice recordings and electroglottograph signals of more than 2,000 speakers. The idea behind this experiment is to apply convolutional layers in combination with recurrent Long Short-Term Memory (LSTM) layers to the raw audio signal. Each recording was split into 64 ms Hamming-windowed segments with 30 ms overlap. Our trained model achieved 71.36% accuracy with 65.04% sensitivity and 77.67% specificity on 206 validation files and 68.08% accuracy with 66.75% sensitivity and 77.89% specificity on 874 testing files. This is a promising result in favor of this approach because it is comparable to a similar previously published experiment that …
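The segmentation step named in the description (64 ms Hamming-windowed segments with 30 ms overlap) can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_into_segments`, the use of NumPy, the 50 kHz sample rate in the usage lines, and the choice to drop a trailing partial segment are all assumptions made for illustration.

```python
import numpy as np


def split_into_segments(signal, sample_rate, window_ms=64.0, overlap_ms=30.0):
    """Split a 1-D signal into overlapping, Hamming-windowed segments.

    Hypothetical helper: window and overlap lengths follow the description,
    everything else (rounding, dropping the incomplete tail) is an assumption.
    """
    signal = np.asarray(signal, dtype=float)
    window_len = int(round(sample_rate * window_ms / 1000.0))
    overlap_len = int(round(sample_rate * overlap_ms / 1000.0))
    hop_len = window_len - overlap_len  # step between segment starts
    if hop_len <= 0:
        raise ValueError("overlap must be shorter than the window")
    if signal.size < window_len:
        # Too short for even one full segment.
        return np.empty((0, window_len))

    # Number of full segments that fit; a trailing partial segment is dropped.
    n_segments = 1 + (signal.size - window_len) // hop_len
    window = np.hamming(window_len)
    return np.stack([
        signal[i * hop_len:i * hop_len + window_len] * window
        for i in range(n_segments)
    ])


# Usage with a synthetic one-second signal (assumed 50 kHz sample rate).
sample_rate = 50_000
signal = np.ones(sample_rate)
segments = split_into_segments(signal, sample_rate)
```

Under the assumed 50 kHz rate, a 64 ms window is 3200 samples and a 30 ms overlap is 1500 samples, so consecutive segments start 1700 samples apart. Each row of the result could then serve as one raw-audio input frame to a convolutional and LSTM model of the kind the description mentions.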
Total citations
[Chart: citations per year, 2018 to 2024]
Scholar articles
P Harar, JB Alonso-Hernandez, J Mekyska, Z Galaz… - 2017 international conference and workshop on …, 2017