facebookresearch / audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
☆21,943Updated last month
Alternatives and similar repositories for audiocraft:
Users who are interested in audiocraft are comparing it to the libraries listed below:
- 🔊 Text-Prompted Generative Audio Model☆37,722Updated 8 months ago
- Foundational Models for State-of-the-Art Speech and Text Translation☆11,509Updated 5 months ago
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head☆10,143Updated 10 months ago
- ☆7,805Updated last year
- Stable diffusion for real-time music generation☆3,673Updated 9 months ago
- StableLM: Stability AI Language Models☆15,832Updated last year
- Community interface for generative AI☆8,973Updated last year
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/☆7,864Updated last year
- Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch☆3,248Updated last year
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production☆39,821Updated 8 months ago
- Zero-Shot Speech Editing and Text-to-Speech in the Wild☆8,255Updated last month
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.☆38,535Updated 3 weeks ago
- Let us control diffusion models!☆32,245Updated last year
- Code and documentation to train Stanford's Alpaca models, and generate the data.☆29,980Updated 9 months ago
- AudioLDM: Generate speech, sound effects, music and beyond, with text.☆2,637Updated 5 months ago
- Generate 3D objects conditioned on text or images☆11,893Updated 10 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models☆5,714Updated 9 months ago
- Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)☆25,662Updated 8 months ago
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath☆9,393Updated 9 months ago
- [CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation☆12,701Updated 10 months ago
- A latent text-to-image diffusion model☆70,574Updated 10 months ago
- Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and creat…☆25,036Updated this week
- ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.☆9,481Updated this week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical…☆37,344Updated 8 months ago
- Inference code for Llama models☆58,197Updated 3 months ago
- High-Resolution Image Synthesis with Latent Diffusion Models☆40,917Updated 7 months ago
- Universal LLM Deployment Engine with ML Compilation☆20,579Updated last week
- Stable Diffusion with Core ML on Apple Silicon☆17,285Updated 3 months ago
- Making large AI models cheaper, faster and more accessible☆40,863Updated this week
- Generative Models by Stability AI☆25,827Updated last month