Stability-AI / stable-audio-tools
Generative models for conditional audio generation
☆2,724 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for stable-audio-tools
- Text-to-Audio/Music Generation ☆2,306 Updated last month
- TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5,… ☆1,826 Updated this week
- Official implementation of "Separate Anything You Describe" ☆1,632 Updated 3 weeks ago
- AI powered speech denoising and enhancement ☆1,432 Updated 2 weeks ago
- A family of diffusion models for text-to-audio generation. ☆1,087 Updated 4 months ago
- Stable diffusion for real-time music generation ☆3,413 Updated 3 months ago
- Foundational model for human-like, expressive TTS ☆3,895 Updated 3 months ago
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆555 Updated 3 months ago
- A webui for different audio related Neural Networks ☆1,079 Updated 3 months ago
- AudioLDM: Generate speech, sound effects, music and beyond, with text. ☆2,451 Updated last month
- [ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors ☆2,589 Updated 2 months ago
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models ☆1,617 Updated 10 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆4,962 Updated 3 months ago
- MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation ☆2,278 Updated 3 months ago
- Character Animation (AnimateAnyone, Face Reenactment) ☆3,185 Updated 5 months ago
- A simple, high-quality voice conversion tool focused on ease of use and performance ☆1,804 Updated this week
- ✨ Hotshot-XL: State-of-the-art AI text-to-GIF model trained to work alongside Stable Diffusion XL ☆1,064 Updated 9 months ago
- Versatile audio super resolution (any -> 48kHz) with AudioSR. ☆1,163 Updated 6 months ago
- ☆646 Updated 2 weeks ago
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch ☆2,444 Updated last week
- Audio generation using diffusion models, in PyTorch. ☆1,963 Updated last year
- The code for the bark-voicecloning model. Training and inference. ☆668 Updated last year
- Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch ☆1,427 Updated 2 weeks ago
- 🔊 Text-Prompted Generative Audio Model with Gradio ☆674 Updated 11 months ago
- High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance ☆1,899 Updated last month
- ComfyUI nodes for LivePortrait ☆1,648 Updated 3 months ago
- AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of adv… ☆1,120 Updated this week
- animatediff prompt travel ☆1,191 Updated 10 months ago
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ☆2,809 Updated 3 weeks ago
- ☆1,094 Updated 5 months ago