Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" (ICASSP 2023)
☆86 · Oct 10, 2023 · Updated 2 years ago
Alternatives and similar repositories for T2A
Users who are interested in T2A are comparing it to the repositories listed below
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In… ☆45 · Mar 25, 2024 · Updated last year
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 · May 13, 2024 · Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. ☆48 · Sep 4, 2023 · Updated 2 years ago
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995 ☆78 · Dec 3, 2024 · Updated last year
- ☆101 · Oct 30, 2025 · Updated 4 months ago
- Code for paper 'EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model' ☆201 · Apr 28, 2023 · Updated 2 years ago
- ☆428 · Nov 1, 2023 · Updated 2 years ago
- BigVGAN with Neural Source-Filter ☆56 · Sep 21, 2023 · Updated 2 years ago
- ICASSP 2024 - Generative De-Quantization for Neural Speech Codec via Latent Diffusion. ☆55 · Nov 16, 2025 · Updated 3 months ago
- Official implementation of Meta-StyleSpeech and StyleSpeech ☆252 · Feb 9, 2022 · Updated 4 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 · May 30, 2023 · Updated 2 years ago
- ICASSP 2023 Accepted ☆190 · May 6, 2024 · Updated last year
- Code for "Distribution-based Emotion Recognition in Conversation" ☆19 · Feb 6, 2023 · Updated 3 years ago
- Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022) ☆121 · Jan 24, 2023 · Updated 3 years ago
- The open source code for LLM-Codec ☆145 · Aug 18, 2024 · Updated last year
- ☆39 · Apr 15, 2024 · Updated last year
- SyncTalkFace: Talking Face Generation for Precise Lip-syncing via Audio-Lip Memory ☆33 · Nov 3, 2022 · Updated 3 years ago
- ☆25 · Mar 12, 2022 · Updated 3 years ago
- The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation" ☆366 · Aug 3, 2023 · Updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆111 · Apr 1, 2024 · Updated last year
- PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis ☆69 · Aug 3, 2021 · Updated 4 years ago
- Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing ☆89 · Sep 6, 2024 · Updated last year
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech. ☆211 · Jan 18, 2024 · Updated 2 years ago
- Audio-Visual Lip Synthesis via Intermediate Landmark Representation ☆18 · May 16, 2023 · Updated 2 years ago
- ☆163 · Sep 19, 2022 · Updated 3 years ago
- ☆526 · Dec 26, 2023 · Updated 2 years ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆57 · Oct 31, 2023 · Updated 2 years ago
- ☆134 · Feb 4, 2023 · Updated 3 years ago
- Connected Papers knockoff, managing academic papers and citations with graph database. ☆12 · Dec 26, 2023 · Updated 2 years ago
- Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge ☆21 · Jul 25, 2022 · Updated 3 years ago
- This repo contains conv-tasnet for basis-melgan. If you want to get code of basis-melgan, please refer to FastVocoder. ☆21 · Jul 21, 2021 · Updated 4 years ago
- [AAAI 2024] Code for CTX-vec2wav in UniCATS ☆130 · Jun 11, 2024 · Updated last year
- ☆276 · Jun 8, 2024 · Updated last year
- Speech samples and code of BEdit-TTS ☆34 · Oct 8, 2023 · Updated 2 years ago
- ☆88 · Nov 1, 2022 · Updated 3 years ago
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec ☆115 · Jun 23, 2025 · Updated 8 months ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆192 · Jul 12, 2024 · Updated last year
- ☆19 · Mar 22, 2024 · Updated last year
- ☆41 · May 15, 2023 · Updated 2 years ago