Stability-AI / stable-audio-tools
Generative models for conditional audio generation
☆3,335 · Updated 3 weeks ago
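For context before the comparison list: stable-audio-tools ships pretrained text-conditioned latent diffusion models for audio. The snippet below is a minimal sketch of conditional generation, assuming the `get_pretrained_model` and `generate_diffusion_cond` helpers and the `stabilityai/stable-audio-open-1.0` checkpoint shown in the project's own examples; parameter values are illustrative, not canonical.

```python
import torch
import torchaudio
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

device = "cuda" if torch.cuda.is_available() else "cpu"

# Download the model and its config (sample rate, window size) from Hugging Face.
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
model = model.to(device)

# Text prompt plus timing conditioning for the requested clip length.
conditioning = [{
    "prompt": "128 BPM tech house drum loop",
    "seconds_start": 0,
    "seconds_total": 30,
}]

# Run the conditional diffusion sampler (step count and CFG scale are assumptions).
output = generate_diffusion_cond(
    model,
    steps=100,
    cfg_scale=7,
    conditioning=conditioning,
    sample_size=model_config["sample_size"],
    sampler_type="dpmpp-3m-sde",
    device=device,
)

# (batch, channels, samples) -> (channels, samples), then write a 16-bit WAV.
output = rearrange(output, "b d n -> d (b n)")
output = output.to(torch.float32)
output = (output / output.abs().max()).clamp(-1, 1).mul(32767).to(torch.int16).cpu()
torchaudio.save("output.wav", output, model_config["sample_rate"])
```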
Alternatives and similar repositories for stable-audio-tools
Users interested in stable-audio-tools are comparing it to the libraries listed below.
- Text-to-Audio/Music Generation · ☆2,446 · Updated 8 months ago
- Stable diffusion for real-time music generation · ☆3,730 · Updated 11 months ago
- AudioLDM: Generate speech, sound effects, music and beyond, with text. · ☆2,680 · Updated 3 weeks ago
- [ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors · ☆2,877 · Updated 9 months ago
- Foundational model for human-like, expressive TTS · ☆4,132 · Updated 10 months ago
- Official Code for Stable Cascade · ☆6,589 · Updated 10 months ago
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… · ☆614 · Updated 10 months ago
- Audio generation using diffusion models, in PyTorch. · ☆2,058 · Updated 2 years ago
- Official implementation of "Separate Anything You Describe" · ☆1,752 · Updated 6 months ago
- Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion · ☆1,729 · Updated 3 weeks ago
- A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice,… · ☆2,286 · Updated this week
- A simple, high-quality voice conversion tool focused on ease of use and performance. · ☆2,428 · Updated this week
- Inference and training library for high-quality TTS models. · ☆5,314 · Updated 6 months ago
- The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. · ☆6,052 · Updated 11 months ago
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis · ☆3,113 · Updated 7 months ago
- AnimateDiff for AUTOMATIC1111 Stable Diffusion WebUI · ☆3,317 · Updated 9 months ago
- Official implementation of AnimateDiff. · ☆11,520 · Updated 10 months ago
- Lumina-T2X is a unified framework for Text to Any Modality Generation · ☆2,198 · Updated 4 months ago
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch · ☆2,558 · Updated 5 months ago
- InspireMusic: A toolkit designed for music, song, and audio generation · ☆1,122 · Updated last month
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… · ☆9,174 · Updated 3 weeks ago
- A general fine-tuning kit geared toward diffusion models. · ☆2,386 · Updated this week
- AI powered speech denoising and enhancement · ☆1,848 · Updated 6 months ago
- Text-to-Music Generation with Rectified Flow Transformers · ☆1,701 · Updated 6 months ago
- ACE-Step: A Step Towards Music Generation Foundation Model · ☆2,512 · Updated 3 weeks ago
- Character Animation (AnimateAnyone, Face Reenactment) · ☆3,404 · Updated last year
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models · ☆1,735 · Updated last year
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models · ☆5,799 · Updated 10 months ago
- Transparent Image Layer Diffusion using Latent Transparency · ☆2,133 · Updated last year
- A family of diffusion models for text-to-audio generation. · ☆1,174 · Updated 5 months ago