haidog-yaqub / EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
☆234 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for EzAudio
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆158 · Updated last month
- ☆252 · Updated 7 months ago
- LlamaVoice is a Llama-based large voice generation model that supports both inference and training. ☆221 · Updated 2 months ago
- Awesome music generation model: MG² ☆104 · Updated this week
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3 ☆362 · Updated last month
- Generative models for conditional audio generation ☆117 · Updated 2 months ago
- Fine-tune Stable Audio Open with DiT ControlNet. ☆176 · Updated 2 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch ☆340 · Updated last week
- Interface for OuteTTS models. ☆347 · Updated this week
- ☆303 · Updated 2 months ago
- A text-conditional diffusion probabilistic model capable of generating high-fidelity audio. ☆124 · Updated 5 months ago
- OpenMusic: SOTA Text-to-music (TTM) Generation ☆475 · Updated this week
- Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations. ☆150 · Updated 3 months ago
- Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation ☆348 · Updated this week
- The official implementation of PeriodWave and PeriodWave-Turbo ☆129 · Updated 2 months ago
- ☆139 · Updated 3 weeks ago
- Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati… ☆154 · Updated 7 months ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆135 · Updated 3 months ago
- VALL-E 2 reproduction ☆83 · Updated 3 months ago
- ☆62 · Updated 3 weeks ago
- Collection of Open Source Speech Data ☆144 · Updated this week
- FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝 ☆460 · Updated 3 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆81 · Updated last month
- An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io ☆66 · Updated last year
- Official Implementation of StyleTTS ☆398 · Updated 11 months ago
- ☆176 · Updated last month
- Refactored / updated version of `stable-audio-tools` which is an open-source code for audio/music generative models originally by Stabili… ☆143 · Updated 3 months ago
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching" ☆306 · Updated 2 months ago
- Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours) ☆277 · Updated this week
- An Open-Sourced LLM-empowered Foundation TTS System ☆424 · Updated 3 weeks ago