Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
★23,204 · Mar 3, 2026 · Updated last month
Alternatives and similar repositories for audiocraft
Users who are interested in audiocraft are comparing it to the libraries listed below.
- 🔊 Text-Prompted Generative Audio Model ★39,086 · Aug 19, 2024 · Updated last year
- Text-to-Audio/Music Generation ★2,613 · Sep 29, 2024 · Updated last year
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ★45,116 · Aug 16, 2024 · Updated last year
- Foundational Models for State-of-the-Art Speech and Text Translation ★11,775 · Apr 8, 2026 · Updated 2 weeks ago
- State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio. ★3,941 · Jan 4, 2024 · Updated 2 years ago
- Generative models for conditional audio generation ★3,671 · Feb 14, 2026 · Updated 2 months ago
- Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch ★3,292 · Sep 6, 2023 · Updated 2 years ago
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head ★10,193 · Jul 6, 2024 · Updated last year
- Instant voice cloning by MIT and MyShell. Audio foundation model. ★36,243 · Apr 19, 2025 · Updated last year
- AudioLDM: Generate speech, sound effects, music and beyond, with text. ★2,861 · Jun 25, 2025 · Updated 9 months ago
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch ★2,619 · Jan 12, 2025 · Updated last year
- Robust Speech Recognition via Large-Scale Weak Supervision ★97,885 · Apr 15, 2026 · Updated last week
- Generative Models by Stability AI ★27,099 · Dec 16, 2025 · Updated 4 months ago
- Amphion (/æmˈfaɪɒn/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ★9,774 · Mar 25, 2026 · Updated 3 weeks ago
- Muzic: Music Understanding and Generation with Artificial Intelligence ★4,906 · Oct 12, 2024 · Updated last year
- Official implementation of AnimateDiff. ★12,097 · Jul 31, 2024 · Updated last year
- The original local LLM interface. Text, vision, tool-calling, training. UI + API, 100% offline and private. ★46,836 · Updated this week
- Stable diffusion for real-time music generation ★3,895 · Jul 22, 2024 · Updated last year
- Official Code for DragGAN (SIGGRAPH 2023) ★35,905 · May 18, 2024 · Updated last year
- Audio generation using diffusion models, in PyTorch. ★2,102 · Jun 12, 2023 · Updated 2 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/ ★7,957 · Feb 11, 2024 · Updated 2 years ago
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. ★109,561 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ★39,451 · Jun 2, 2025 · Updated 10 months ago
- Let us control diffusion models! ★33,805 · Feb 25, 2024 · Updated 2 years ago
- Stable Diffusion web UI ★162,488 · Mar 2, 2026 · Updated last month
- Inference code for Llama models ★59,349 · Jan 26, 2025 · Updated last year
- GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use. ★77,355 · May 27, 2025 · Updated 10 months ago
- LLM inference in C/C++ ★104,862 · Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ★24,707 · Aug 12, 2024 · Updated last year
- StableLM: Stability AI Language Models ★15,727 · Apr 8, 2024 · Updated 2 years ago
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ★37,422 · Aug 17, 2024 · Updated last year
- Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and creat… ★27,025 · Updated this week
- CLI platform to experiment with codegen. Precursor to: https://lovable.dev ★55,215 · May 14, 2025 · Updated 11 months ago
- The agent engineering platform ★133,997 · Updated this week
- Universal LLM Deployment Engine with ML Compilation ★22,482 · Apr 14, 2026 · Updated last week
- ★30,491 · Mar 13, 2026 · Updated last month
- A multi-voice TTS system trained with an emphasis on quality ★14,843 · Nov 19, 2024 · Updated last year
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. ★1,772 · Jan 26, 2026 · Updated 2 months ago
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. ★33,412 · Updated this week