Tao Li
Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science
Verified email at npu-aslp.org
Title
Cited by
Year
Controllable emotion transfer for end-to-end speech synthesis
T Li, S Yang, L Xue, L Xie
2021 12th International Symposium on Chinese Spoken Language Processing …, 2021
Cited by 85, 2021
Cross-speaker emotion disentangling and transfer for end-to-end speech synthesis
T Li, X Wang, Q Xie, Z Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1448-1460, 2022
Cited by 41*, 2022
Enriching source style transfer in recognition-synthesis based non-parallel voice conversion
Z Wang, X Zhou, F Yang, T Li, H Du, L Xie, W Gan, H Chen, H Li
arXiv preprint arXiv:2106.08741, 2021
Cited by 19, 2021
One-shot voice conversion for style transfer based on speaker adaptation
Z Wang, Q Xie, T Li, H Du, L Xie, P Zhu, M Bi
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by 15, 2022
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis
T Li, X Wang, Q Xie, Z Wang, M Jiang, L Xie
arXiv preprint arXiv:2207.01198, 2022
Cited by 13, 2022
Multi-speaker expressive speech synthesis via multiple factors decoupling
X Zhu, Y Lei, K Song, Y Zhang, T Li, L Xie
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 11, 2023
Multi-speaker multi-style text-to-speech synthesis with single-speaker single-style training data scenarios
Q Xie, T Li, X Wang, Z Wang, L Xie, G Yu, G Wan
2022 13th International Symposium on Chinese Spoken Language Processing …, 2022
Cited by 11, 2022
METTS: Multilingual emotional text-to-speech by cross-speaker and cross-lingual emotion transfer
X Zhu, Y Lei, T Li, Y Zhang, H Zhou, H Lu, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by 8, 2024
DiCLET-TTS: Diffusion model based cross-lingual emotion transfer for text-to-speech—A study between English and Mandarin
T Li, C Hu, J Cong, X Zhu, J Li, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Cited by 8, 2023
Vec-tok speech: Speech vectorization and tokenization for neural speech generation
X Zhu, Y Lv, Y Lei, T Li, W He, H Zhou, H Lu, L Xie
arXiv preprint arXiv:2310.07246, 2023
Cited by 6, 2023
MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis
W Guan, Y Li, T Li, H Huang, F Wang, J Lin, L Huang, L Li, Q Hong
Proceedings of the AAAI Conference on Artificial Intelligence 38 (16), 18117 …, 2024
Cited by 5, 2024
MSM-VC: high-fidelity source style transfer for non-parallel voice conversion by multi-scale style modeling
Z Wang, X Wang, Q Xie, T Li, L Xie, Q Tian, Y Wang
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Cited by 3, 2023
HIGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS
D Guo, X Zhu, L Xue, T Li, Y Lv, Y Jiang, L Xie
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023
Cited by 2, 2023
Improving Multi-Speaker ASR With Overlap-Aware Encoding And Monotonic Attention
T Li, F Wang, W Guan, L Huang, Q Hong, L Li
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 1, 2024
CASA-Net: Cross-attention and Self-attention for End-to-End Audio-visual Speaker Diarization
H Zhou, T Li, J Wang, L Li, Q Hong
2023 Asia Pacific Signal and Information Processing Association Annual …, 2023
Cited by 1, 2023
Conformer-based Language Embedding with Self-Knowledge Distillation for Spoken Language Identification
F Wang, L Huang, T Li, Q Hong, L Li
Proceedings of the Interspeech, 5286-5290, 2023
Cited by 1, 2023
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
T Li, Z Wang, X Zhu, J Cong, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
2024
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling
Y Jiang, T Li, F Yang, L Xie, M Meng, Y Wang
arXiv preprint arXiv:2406.05681, 2024
2024
The XMUSpeech system for audio-visual target speaker extraction in MISP 2023 challenge
L Luo, T Li, L Li, Q Hong
2024 IEEE International Conference on Acoustics, Speech, and Signal …, 2024
2024
A Pipelined Framework with Serialized Output Training for Overlapping Speech Recognition
T Li, L Huang, F Wang, S Li, Q Hong, L Li
National Conference on Man-Machine Speech Communication, 114-123, 2022
2022