About me

Wei-Ning is a research scientist at Facebook AI Research (FAIR). His research focuses on representation learning, self-supervised learning, and structured generative modeling for unimodal and multimodal speech. He is passionate about reducing the supervision required for various speech applications and developing technologies applicable to both written and unwritten languages.

Prior to joining Facebook, Wei-Ning received his Ph.D. and S.M. degrees in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 2020 and 2018, respectively, under the supervision of Dr. James Glass. He received his B.S. degree in Electrical Engineering from National Taiwan University in 2014, under the supervision of Prof. Lin-shan Lee and Prof. Hsuan-Tien Lin.

Recent Research Highlights

Audio-Visual HuBERT: the first self-supervised model for audio-visual speech, achieving state-of-the-art performance on lip-reading, speech recognition, and audio-visual speech recognition with much less labeled data
data2vec: the first high-performance self-supervised algorithm that works for speech, vision, and text
Textless Speech-to-Speech Translation on Real Data: the first text-free speech-to-speech translation model trained on real data that is on par with text-based models
wav2vec-U: an unsupervised speech recognition framework that rivals the best supervised model from 2 years ago and works for 10 languages
Textless NLP: a model that can do prompted or unprompted speech generation without using any text (like an audio version of GPT-2)
HuBERT: a state-of-the-art self-supervised speech representation learning model for recognition, generation, and compression

Wei-Ning Hsu (徐煒甯)
