- 🎓 I'm currently a Ph.D. student at Harbin Institute of Technology and a research intern at Microsoft Research Asia.
- 🌱 My research interests include self-supervised learning, speech and audio processing, and spoken language processing.
- 📄 My research highlights:
- [Nov 2023] VALL-E produced the AI audiobook of Impromptu: Amplifying Our Humanity Through AI with an “AI Reid” voice.
- [Apr 2023] VALL-E won the UNESCO Netexplo Innovation Award 2023 (top 10 out of over 3,000 innovations of the year).
- [Apr 2023] BEATs was accepted to ICML 2023 as an oral paper.
- [Mar 2023] VALL-E X, a cross-lingual version of VALL-E, can help anyone speak a foreign language in their own voice without an accent. See https://aka.ms/vallex for demos.
- [Jan 2023] VALL-E, a language-modeling approach to text-to-speech synthesis, achieves state-of-the-art zero-shot TTS performance and exhibits emergent in-context learning capabilities. See https://aka.ms/valle for demos.
- [Dec 2022] BEATs, an audio pre-training framework based on discrete label prediction, ranks 1st on the AudioSet, Balanced AudioSet, and ESC-50 leaderboards. We released the code and pre-trained models.
- [Nov 2022] WavLM is now available in TorchAudio. Try it out here.
- [Sep 2022] SpeechLM, a text-enhanced speech pre-training model, achieves a 16% relative WER reduction over data2vec with only 10K text sentences on the LibriSpeech speech recognition benchmark. We released the code and pre-trained models.
- [Sep 2022] WavLM was published in the IEEE Journal of Selected Topics in Signal Processing.
- [Jan 2022] WavLM ranks 1st on the VoxSRC 2021 speaker verification permanent leaderboard.
- [Dec 2021] A WavLM speaker verification demo is available on Hugging Face.
- [Nov 2021] WavLM code and pre-trained models are released here.
- [Oct 2021] WavLM ranks 1st on the SUPERB leaderboard.
- [Oct 2021] WavLM, a large-scale self-supervised pre-training framework for full-stack speech processing, achieves state-of-the-art performance on 19 tasks, including all 15 tasks of the SUPERB benchmark, the VoxCeleb1 speaker verification benchmark, the LibriCSS speech separation benchmark, the CALLHOME speech diarization benchmark, and the LibriSpeech speech recognition benchmark.
- [Oct 2021] Our ultra-fast continuous speech separation model is shipped in the Microsoft Conversation Transcription Service.
- [Dec 2020] Our continuous speech separation model is shipped in the Microsoft Conversation Transcription Service.
- [Oct 2020] The Microsoft speaker diarization system with conformer-based continuous speech separation ranked 1st in the VoxCeleb Speaker Recognition Challenge 2020.
- [Aug 2020] Continuous speech separation with Conformer achieves state-of-the-art performance on the LibriCSS speech separation benchmark. We released the code and pre-trained models. See demos here.
- [Apr 2020] RecAdam, my first first-author paper, achieves state-of-the-art performance on the GLUE benchmark. We released the code.
-
Name: Sanyuan Chen (陈三元)
Company: Meta
Bio: Research Scientist @ Meta FAIR
Location: New York