maskgct / maskgct.github.io
MaskGCT demo page
☆12Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for maskgct.github.io
- We Speech Transcript based on LLM, in 300 lines of code.☆126Updated 2 months ago
- SenseVoice-python: An enterprise-grade open-source multilingual ASR system from FunASR, with onnxruntime☆73Updated last month
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3☆362Updated 2 months ago
- A text-conditional diffusion probabilistic model capable of generating high-fidelity audio.☆124Updated 5 months ago
- ☆65Updated 11 months ago
- A lightweight end-to-end text-to-speech model☆91Updated last month
- VC Without Retrain!☆102Updated 6 months ago
- ☆166Updated last month
- Text-to-speech using an autoregressive transformer and VITS☆227Updated 7 months ago
- A toolkit for speaker diarization.☆141Updated this week
- Unofficial implementation of Megatts2☆264Updated 7 months ago
- An Open-Sourced LLM-empowered Foundation TTS System☆424Updated 3 weeks ago
- MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction mode…☆156Updated last week
- Open source inference code for Rev's model☆331Updated 2 weeks ago
- FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝☆460Updated 3 months ago
- The reproduced code for Google's SoundStorm☆253Updated last year
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"☆306Updated 2 months ago
- QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion☆226Updated last year
- Diffusion Singing Voice Conversion based on Grad-TTS from HuaWei☆130Updated last year
- API for a Vocal Remover that uses Deep Neural Networks.☆85Updated 4 months ago
- Preprocess Audio for training☆253Updated last month
- Anim-400K: A dataset designed from the ground up for automated dubbing of video☆99Updated 4 months ago
- ☆171Updated 11 months ago
- High-quality Text-to-Audio Generation with Efficient Diffusion Transformer☆234Updated 3 weeks ago
- High-resolution clothing swap, virtual try-on, and object replacement for people in 4K video☆50Updated 2 years ago
- ☆257Updated 5 months ago
- Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice☆131Updated last month
- Pseudo Streaming SenseVoice with Hotwords☆78Updated last week
- Bert-vits2-V2.3 training and inference☆43Updated 8 months ago