Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
☆23,423 · Mar 3, 2026 · Updated 3 months ago
Alternatives and similar repositories for audiocraft
Users that are interested in audiocraft are comparing it to the libraries listed below.
- 🔊 Text-Prompted Generative Audio Model ☆39,172 · Aug 19, 2024 · Updated last year
- Text-to-Audio/Music Generation ☆2,633 · Sep 29, 2024 · Updated last year
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆45,642 · Aug 16, 2024 · Updated last year
- Foundational Models for State-of-the-Art Speech and Text Translation ☆11,808 · Apr 8, 2026 · Updated 2 months ago
- State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio. ☆3,980 · Jan 4, 2024 · Updated 2 years ago
- Generative models for conditional audio generation ☆3,800 · Jun 20, 2026 · Updated last week
- Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch ☆3,293 · Sep 6, 2023 · Updated 2 years ago
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head ☆10,176 · Jul 6, 2024 · Updated last year
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆36,789 · Apr 19, 2025 · Updated last year
- AudioLDM: Generate speech, sound effects, music and beyond, with text. ☆2,894 · Jun 25, 2025 · Updated last year
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch ☆2,620 · Jan 12, 2025 · Updated last year
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆9,906 · Mar 25, 2026 · Updated 3 months ago
- Robust Speech Recognition via Large-Scale Weak Supervision ☆103,646 · Apr 15, 2026 · Updated 2 months ago
- Generative Models by Stability AI ☆27,205 · Dec 16, 2025 · Updated 6 months ago
- Muzic: Music Understanding and Generation with Artificial Intelligence ☆4,928 · Oct 12, 2024 · Updated last year
- Official implementation of AnimateDiff. ☆12,162 · Jul 31, 2024 · Updated last year
- Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private. ☆47,390 · Jun 2, 2026 · Updated last month
- Stable diffusion for real-time music generation ☆3,896 · Jul 22, 2024 · Updated last year
- Official Code for DragGAN (SIGGRAPH 2023) ☆35,814 · May 18, 2024 · Updated 2 years ago
- Audio generation using diffusion models, in PyTorch. ☆2,100 · Jun 12, 2023 · Updated 3 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/ ☆7,936 · Feb 11, 2024 · Updated 2 years ago
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. ☆118,878 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,492 · May 1, 2026 · Updated 2 months ago
- Let us control diffusion models! ☆33,965 · Feb 25, 2024 · Updated 2 years ago
- Stable Diffusion web UI ☆163,930 · Mar 2, 2026 · Updated 4 months ago
- Inference code for Llama models ☆59,475 · Jan 26, 2025 · Updated last year
- GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use. ☆77,382 · May 27, 2025 · Updated last year
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆24,896 · Aug 12, 2024 · Updated last year
- LLM inference in C/C++ ☆118,422 · Updated this week
- StableLM: Stability AI Language Models ☆15,696 · Apr 8, 2024 · Updated 2 years ago
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ☆37,392 · Aug 17, 2024 · Updated last year
- Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and creat… ☆27,533 · Updated this week
- CLI platform to experiment with codegen. Precursor to: https://lovable.dev ☆55,204 · May 14, 2025 · Updated last year
- The agent engineering platform. ☆140,319 · Updated this week
- Universal LLM Deployment Engine with ML Compilation ☆22,863 · May 11, 2026 · Updated last month
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. ☆1,828 · Jan 26, 2026 · Updated 5 months ago
- A multi-voice TTS system trained with an emphasis on quality ☆14,865 · Nov 19, 2024 · Updated last year
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ☆33,960 · Updated this week
- Industry leading face manipulation platform ☆29,071 · Jun 24, 2026 · Updated last week