facebookresearch / audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
⭐ 22,694 · Updated 8 months ago
Alternatives and similar repositories for audiocraft
Users interested in audiocraft are comparing it to the libraries listed below.
- 🔊 Text-Prompted Generative Audio Model (⭐ 38,713 · Updated last year)
- Stable diffusion for real-time music generation (⭐ 3,826 · Updated last year)
- Foundational Models for State-of-the-Art Speech and Text Translation (⭐ 11,703 · Updated last year)
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (⭐ 10,201 · Updated last year)
- AudioLDM: Generate speech, sound effects, music and beyond, with text. (⭐ 2,770 · Updated 4 months ago)
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production (⭐ 43,540 · Updated last year)
- Community interface for generative AI (⭐ 9,035 · Updated last year)
- Text-to-Audio/Music Generation (⭐ 2,520 · Updated last year)
- Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch (⭐ 3,285 · Updated 2 years ago)
- Muzic: Music Understanding and Generation with Artificial Intelligence (⭐ 4,871 · Updated last year)
- An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/ (⭐ 7,975 · Updated last year)
- Generative models for conditional audio generation (⭐ 3,507 · Updated last month)
- Let us control diffusion models! (⭐ 33,301 · Updated last year)
- StableLM: Stability AI Language Models (⭐ 15,792 · Updated last year)
- (⭐ 7,841 · Updated last year)
- A multi-voice TTS system trained with an emphasis on quality (⭐ 14,704 · Updated last year)
- Official implementation of AnimateDiff. (⭐ 11,873 · Updated last year)
- WebUI extension for ControlNet (⭐ 17,852 · Updated last year)
- State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio. (⭐ 3,837 · Updated last year)
- 🔊 Text-prompted Generative Audio Model - With the ability to clone voices (⭐ 3,335 · Updated 2 months ago)
- Stable diffusion for real-time music generation (web app) (⭐ 2,679 · Updated last year)
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models (⭐ 6,063 · Updated last year)
- [CVPR 2024] Official repository for "MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model" (⭐ 10,877 · Updated 2 months ago)
- Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion. (⭐ 8,769 · Updated last year)
- 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch. (⭐ 31,617 · Updated last week)
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… (⭐ 9,505 · Updated 5 months ago)
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch (⭐ 2,603 · Updated 10 months ago)
- The definitive Web UI for local AI, with powerful features and easy setup. (⭐ 45,459 · Updated this week)
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! (⭐ 40,534 · Updated last week)
- JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf (⭐ 24,449 · Updated 3 months ago)