yannqi / Draw-an-Audio-Code
Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis.
☆46 Updated 8 months ago
Alternatives and similar repositories for Draw-an-Audio-Code
Users who are interested in Draw-an-Audio-Code are comparing it to the libraries listed below.
- ☆59 Updated 10 months ago
- Music production for silent film clips. ☆25 Updated last month
- ☆20 Updated last year
- ☆70 Updated last month
- A text-conditional diffusion probabilistic model capable of generating high-fidelity audio. ☆164 Updated last year
- Official Repo for MoCha: Towards Movie-Grade Talking Character Synthesis ☆26 Updated 2 weeks ago
- ☆67 Updated 2 months ago
- [AAAI 2025] VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization ☆49 Updated 5 months ago
- Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models ☆186 Updated last year
- An official implementation of SwapAnyone. ☆62 Updated 2 months ago
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies". ☆86 Updated last year
- Anim-400K: A dataset designed from the ground up for automated dubbing of video ☆106 Updated 11 months ago
- ☆15 Updated 2 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆98 Updated 2 weeks ago
- ☆12 Updated 2 months ago
- Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati… ☆182 Updated last year
- ☆75 Updated last year
- ☆170 Updated 5 months ago
- An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community … ☆60 Updated last week
- Paper: "From Text to Pose to Image: Improving Diffusion Model Control and Quality" ☆51 Updated 6 months ago
- [AAAI 2024] V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models ☆25 Updated last year
- ☆79 Updated 3 months ago
- ☆62 Updated 10 months ago
- [ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation ☆227 Updated 2 months ago
- [ICASSP 2024] DiffDub: Person-generic visual dubbing using inpainting renderer with diffusion auto-encoder ☆63 Updated 10 months ago
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆73 Updated last month
- LVAS-Agent Code Base ☆17 Updated last month
- Animatediff implementation. Includes a ControlNet pipeline. ☆18 Updated last year
- ☆33 Updated last month
- ☆22 Updated 2 months ago