Yiwen Shao
Title / Cited by / Year
Espresso: A fast end-to-end neural speech recognition toolkit
Y Wang, T Chen, H Xu, S Ding, H Lv, Y Shao, N Peng, L Xie, S Watanabe, ...
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019
Cited by: 89; Year: 2019
Speaker diarization with region proposal network
Z Huang, S Watanabe, Y Fujita, P García, Y Shao, D Povey, S Khudanpur
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 76; Year: 2020
PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR
Y Shao, Y Wang, D Povey, S Khudanpur
Proc. Interspeech 2020, 561-565, 2020
Cited by: 48; Year: 2020
Adversarial attacks and defenses for speech recognition systems
P Żelasko, S Joshi, Y Shao, J Villalba, J Trmal, N Dehak, S Khudanpur
arXiv preprint arXiv:2103.17122, 2021
Cited by: 29; Year: 2021
Using ASR methods for OCR
A Arora, CC Chang, B Rekabdar, B BabaAli, D Povey, D Etter, D Raj, ...
2019 International Conference on Document Analysis and Recognition (ICDAR …, 2019
Cited by: 25; Year: 2019
Multi-channel multi-speaker ASR using 3D spatial feature
Y Shao, SX Zhang, D Yu
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 13; Year: 2022
Defense against adversarial attacks on hybrid speech recognition using joint adversarial fine-tuning with denoiser
S Joshi, S Kataria, Y Shao, P Zelasko, J Villalba, S Khudanpur, N Dehak
arXiv preprint arXiv:2204.03851, 2022
Cited by: 11; Year: 2022
Use of pitch continuity for robust speech activity detection
Y Shao, Q Lin
2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018
Cited by: 11; Year: 2018
A Novel Normalization Method for Autocorrelation Function for Pitch Detection and for Speech Activity Detection.
Q Lin, Y Shao
Interspeech, 2097-2101, 2018
Cited by: 7; Year: 2018
Unix-encoder: A universal x-channel speech encoder for ad-hoc microphone array speech processing
Z Huang, Y Shao, SX Zhang, D Yu
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 2; Year: 2024
RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR
Y Shao, SX Zhang, D Yu
arXiv preprint arXiv:2311.00146, 2023
Cited by: 2; Year: 2023
Chunking Defense for Adversarial Attacks on ASR
Y Shao, J Villalba, S Joshi, S Kataria, S Khudanpur, N Dehak
Proc. Interspeech 2022, 2022
Cited by: 1; Year: 2022
Advancing Multi-talker ASR Performance with Large Language Models
M Shi, Z Jin, Y Xu, Y Xu, SX Zhang, K Wei, Y Shao, C Zhang, D Yu
arXiv preprint arXiv:2408.17431, 2024
Year: 2024
Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment
Y Shao, SX Zhang, Y Xu, M Yu, D Yu, D Povey, S Khudanpur
arXiv preprint arXiv:2406.09589, 2024
Year: 2024
Challenges and Insights: Exploring 3D Spatial Features and Complex Networks on the MISP Dataset
Y Shao
arXiv preprint arXiv:2310.03901, 2023
Year: 2023
RIR-SF: Room Impulse Response Based Spatial Feature for Target Speech Recognition in Multi-Channel Multi-Speaker Scenarios
Y Shao, SX Zhang, D Yu