Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
★ 23,177 · Mar 3, 2026 · Updated last month
Alternatives and similar repositories for audiocraft
Users who are interested in audiocraft are comparing it to the libraries listed below.
- 🔊 Text-Prompted Generative Audio Model (★ 39,073 · Aug 19, 2024 · Updated last year)
- Text-to-Audio/Music Generation (★ 2,609 · Sep 29, 2024 · Updated last year)
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production (★ 45,043 · Aug 16, 2024 · Updated last year)
- Foundational Models for State-of-the-Art Speech and Text Translation (★ 11,769 · Apr 8, 2026 · Updated last week)
- State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio. (★ 3,935 · Jan 4, 2024 · Updated 2 years ago)
- Generative models for conditional audio generation (★ 3,664 · Feb 14, 2026 · Updated 2 months ago)
- Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch (★ 3,292 · Sep 6, 2023 · Updated 2 years ago)
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (★ 10,195 · Jul 6, 2024 · Updated last year)
- Instant voice cloning by MIT and MyShell. Audio foundation model. (★ 36,216 · Apr 19, 2025 · Updated 11 months ago)
- AudioLDM: Generate speech, sound effects, music and beyond, with text. (★ 2,854 · Jun 25, 2025 · Updated 9 months ago)
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch (★ 2,619 · Jan 12, 2025 · Updated last year)
- Robust Speech Recognition via Large-Scale Weak Supervision (★ 97,885 · Updated this week)
- Generative Models by Stability AI (★ 27,076 · Dec 16, 2025 · Updated 4 months ago)
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… (★ 9,754 · Mar 25, 2026 · Updated 3 weeks ago)
- Muzic: Music Understanding and Generation with Artificial Intelligence (★ 4,906 · Oct 12, 2024 · Updated last year)
- Official implementation of AnimateDiff. (★ 12,097 · Jul 31, 2024 · Updated last year)
- The original local LLM interface. Text, vision, tool-calling, training. UI + API, 100% offline and private. (★ 46,493 · Updated this week)
- Stable diffusion for real-time music generation (★ 3,895 · Jul 22, 2024 · Updated last year)
- Official Code for DragGAN (SIGGRAPH 2023) (★ 35,905 · May 18, 2024 · Updated last year)
- Audio generation using diffusion models, in PyTorch. (★ 2,100 · Jun 12, 2023 · Updated 2 years ago)
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/ (★ 7,956 · Feb 11, 2024 · Updated 2 years ago)
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. (★ 108,818 · Updated this week)
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. (★ 39,448 · Jun 2, 2025 · Updated 10 months ago)
- Let us control diffusion models! (★ 33,805 · Feb 25, 2024 · Updated 2 years ago)
- Stable Diffusion web UI (★ 162,336 · Mar 2, 2026 · Updated last month)
- Inference code for Llama models (★ 59,324 · Jan 26, 2025 · Updated last year)
- LLM inference in C/C++ (★ 103,237 · Updated this week)
- GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use. (★ 77,328 · May 27, 2025 · Updated 10 months ago)
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. (★ 24,675 · Aug 12, 2024 · Updated last year)
- StableLM: Stability AI Language Models (★ 15,728 · Apr 8, 2024 · Updated 2 years ago)
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… (★ 37,418 · Aug 17, 2024 · Updated last year)
- Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and creat… (★ 26,990 · Updated this week)
- CLI platform to experiment with codegen. Precursor to: https://lovable.dev (★ 55,216 · May 14, 2025 · Updated 11 months ago)
- The agent engineering platform (★ 133,136 · Apr 11, 2026 · Updated last week)
- Universal LLM Deployment Engine with ML Compilation (★ 22,414 · Apr 6, 2026 · Updated last week)
- (★ 30,491 · Mar 13, 2026 · Updated last month)
- A multi-voice TTS system trained with an emphasis on quality (★ 14,832 · Nov 19, 2024 · Updated last year)
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. (★ 1,766 · Jan 26, 2026 · Updated 2 months ago)
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. (★ 33,336 · Updated this week)