jnwnlee / video-foley
Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 2025.
☆16 Updated 2 months ago
Alternatives and similar repositories for video-foley
Users who are interested in video-foley are comparing it to the repositories listed below.
- ☆41 Updated 7 months ago
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral) ☆31 Updated 10 months ago
- ☆24 Updated 2 months ago
- ☆19 Updated last year
- [CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie… ☆22 Updated 5 months ago
- Explaining audio differences using language ☆16 Updated 9 months ago
- [ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis ☆24 Updated 7 months ago
- Art2Mus is a system that generates music based on digitized artworks and text by using the AudioLDM2 architecture with an added projectio… ☆19 Updated last month
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset ☆15 Updated 7 months ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec" ☆33 Updated last month
- ☆44 Updated 10 months ago
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆73 Updated last year
- ☆59 Updated last month
- TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking ☆20 Updated 7 months ago
- ☆43 Updated 7 months ago
- [ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning" ☆31 Updated 6 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆48 Updated last year
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac… ☆34 Updated last year
- Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24) ☆54 Updated 7 months ago
- ☆35 Updated 6 months ago
- ☆10 Updated last year
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies". ☆90 Updated last year
- [CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation ☆42 Updated last year
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes ☆53 Updated last month
- Video Background Music Generation Using Unpaired Audio-Visual Data ☆29 Updated last year
- Pytorch implementation for “V2C: Visual Voice Cloning” ☆32 Updated 2 years ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning ☆15 Updated last year
- Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025) ☆38 Updated 2 months ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024) ☆19 Updated 8 months ago
- This repository collects papers related to Speech Tokenizer. ☆17 Updated last year