facebookresearch / audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
☆22,745 Updated 9 months ago
Alternatives and similar repositories for audiocraft
Users who are interested in audiocraft are comparing it to the libraries listed below.
- 🔊 Text-Prompted Generative Audio Model ☆38,787 Updated last year
- Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch ☆3,285 Updated 2 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/ ☆7,967 Updated last year
- Stable diffusion for real-time music generation ☆3,840 Updated last year
- AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head ☆10,210 Updated last year
- State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio. ☆3,853 Updated last year
- ☆7,847 Updated last year
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆9,536 Updated 6 months ago
- Let us control diffusion models! ☆33,406 Updated last year
- Inference code for CodeLlama models ☆16,367 Updated last year
- Community interface for generative AI ☆9,042 Updated last year
- An unofficial PyTorch implementation of the audio LM VALL-E ☆2,992 Updated 2 years ago
- Muzic: Music Understanding and Generation with Artificial Intelligence ☆4,884 Updated last year
- Foundational Models for State-of-the-Art Speech and Text Translation ☆11,717 Updated last year
- StableLM: Stability AI Language Models ☆15,789 Updated last year
- [CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation ☆13,421 Updated last year
- AudioLDM: Generate speech, sound effects, music and beyond, with text. ☆2,781 Updated 5 months ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,290 Updated 6 months ago
- Text-to-Audio/Music Generation ☆2,536 Updated last year
- Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch ☆2,608 Updated 11 months ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆43,837 Updated last year
- A multi-voice TTS system trained with an emphasis on quality ☆14,730 Updated last year
- InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥 ☆11,877 Updated last year
- Inference code for Llama models ☆58,983 Updated 10 months ago
- Generative models for conditional audio generation ☆3,534 Updated 2 months ago
- JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU. ☆4,648 Updated last year
- 🔊 Text-prompted Generative Audio Model - With the ability to clone voices ☆3,336 Updated 3 months ago
- ImageBind One Embedding Space to Bind Them All ☆8,888 Updated 3 weeks ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,082 Updated last year
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ☆37,497 Updated last year