Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
☆ 23,029 · Mar 13, 2025 · Updated 11 months ago
Alternatives and similar repositories for audiocraft
Users that are interested in audiocraft are comparing it to the libraries listed below
- Text-Prompted Generative Audio Model ☆ 39,039 · Aug 19, 2024 · Updated last year
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆ 44,691 · Aug 16, 2024 · Updated last year
- Foundational Models for State-of-the-Art Speech and Text Translation ☆ 11,760 · Updated this week
- Text-to-Audio/Music Generation ☆ 2,587 · Sep 29, 2024 · Updated last year
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head ☆ 10,206 · Jul 6, 2024 · Updated last year
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆ 36,049 · Apr 19, 2025 · Updated 10 months ago
- Generative models for conditional audio generation ☆ 3,618 · Feb 14, 2026 · Updated 3 weeks ago
- Generative Models by Stability AI ☆ 26,964 · Dec 16, 2025 · Updated 2 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision ☆ 95,527 · Dec 15, 2025 · Updated 2 months ago
- State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio. ☆ 3,901 · Jan 4, 2024 · Updated 2 years ago
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆ 9,706 · May 27, 2025 · Updated 9 months ago
- The definitive Web UI for local AI, with powerful features and easy setup. ☆ 46,130 · Updated this week
- Official Code for DragGAN (SIGGRAPH 2023) ☆ 35,972 · May 18, 2024 · Updated last year
- AudioLDM: Generate speech, sound effects, music and beyond, with text. ☆ 2,831 · Jun 25, 2025 · Updated 8 months ago
- Official implementation of AnimateDiff. ☆ 12,045 · Jul 31, 2024 · Updated last year
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆ 39,426 · Jun 2, 2025 · Updated 9 months ago
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. ☆ 104,884 · Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆ 24,500 · Aug 12, 2024 · Updated last year
- Let us control diffusion models! ☆ 33,692 · Feb 25, 2024 · Updated 2 years ago
- LLM inference in C/C++ ☆ 96,322 · Mar 2, 2026 · Updated last week
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/ ☆ 7,957 · Feb 11, 2024 · Updated 2 years ago
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ☆ 37,444 · Aug 17, 2024 · Updated last year
- Muzic: Music Understanding and Generation with Artificial Intelligence ☆ 4,900 · Oct 12, 2024 · Updated last year
- GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use. ☆ 77,171 · May 27, 2025 · Updated 9 months ago
- Stable Diffusion web UI ☆ 161,451 · Mar 2, 2026 · Updated last week
- Inference code for Llama models ☆ 59,183 · Jan 26, 2025 · Updated last year
- StableLM: Stability AI Language Models ☆ 15,756 · Apr 8, 2024 · Updated last year
- Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and creat… ☆ 26,799 · Mar 2, 2026 · Updated last week
- Universal LLM Deployment Engine with ML Compilation ☆ 22,129 · Updated this week
- The agent engineering platform ☆ 128,595 · Updated this week
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ☆ 32,923 · Updated this week
- CLI platform to experiment with codegen. Precursor to: https://lovable.dev ☆ 55,212 · May 14, 2025 · Updated 9 months ago
- Focus on prompting and generating ☆ 47,794 · Dec 1, 2025 · Updated 3 months ago
- A natural language interface for computers ☆ 62,529 · Feb 9, 2026 · Updated last month
- LlamaIndex is the leading document agent and OCR platform ☆ 47,374 · Updated this week
- Industry leading face manipulation platform ☆ 26,995 · Updated this week
- Stable diffusion for real-time music generation ☆ 3,876 · Jul 22, 2024 · Updated last year
- one-click face swap ☆ 30,547 · Aug 19, 2024 · Updated last year
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath ☆ 9,478 · Jun 7, 2025 · Updated 9 months ago