YUCHEN005 / UniVPM
Code for paper "Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition"
☆21 Updated last year
Alternatives and similar repositories for UniVPM:
Users who are interested in UniVPM are comparing it to the repositories listed below.
- Code for paper "Unsupervised Noise adaptation using Data Simulation" ☆12 Updated 8 months ago
- ☆20 Updated 11 months ago
- Official release of StyleTalk dataset. ☆60 Updated 7 months ago
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆53 Updated 7 months ago
- ☆50 Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆60 Updated 2 months ago
- The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synth… ☆82 Updated 2 years ago
- ☆28 Updated 11 months ago
- [Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a… ☆65 Updated 10 months ago
- TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models (2024 ICASSP) ☆150 Updated 2 months ago
- YOLO-Stutter: End-to-end Region-Wise Speech Dysfluency Detection ☆12 Updated 3 months ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model ☆10 Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆79 Updated 9 months ago
- This repository contains official pytorch implementation and pre-trained models for the MR-RawNet. ☆12 Updated 7 months ago
- Code for paper "Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation" ☆42 Updated 6 months ago
- ☆65 Updated last year
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ☆27 Updated 3 months ago
- Code for paper "Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition" ☆39 Updated last year
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆140 Updated last year
- ☆30 Updated last year
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆82 Updated 3 weeks ago
- The official implementation of EmoSphere++ ☆70 Updated last week
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆61 Updated 2 months ago
- ☆63 Updated 4 months ago
- INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models" ☆110 Updated last year
- Code for paper "Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition" ☆18 Updated last year
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆51 Updated last year
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?". ☆24 Updated 6 months ago
- Unofficial SoundStream implementation of Pytorch with training code and 16kHz pretrained checkpoint ☆62 Updated last year
- Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations ☆44 Updated 2 weeks ago