jnwnlee / video-foleyLinks
Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 2025.
☆16Updated 4 months ago
Alternatives and similar repositories for video-foley
Users that are interested in video-foley are comparing it to the libraries listed below
Sorting:
- ☆40Updated 9 months ago
- [CVPR 2025] Official implementation of paper "Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie…☆23Updated 7 months ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆15Updated 9 months ago
- ☆68Updated last month
- ☆24Updated 4 months ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Updated 10 months ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec"☆35Updated 3 months ago
- ☆19Updated last year
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral)☆32Updated last year
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023☆12Updated last year
- ☆47Updated 9 months ago
- PyTorch Implementation of [WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification](https://arxiv.or…☆16Updated 6 months ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".☆61Updated 2 months ago
- TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking☆21Updated 9 months ago
- Implementation of Multi-Source Music Generation with Latent Diffusion.☆28Updated last year
- [Official Implementation] Acoustic Autoregressive Modeling 🔥☆74Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""☆14Updated last year
- [ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis☆41Updated 9 months ago
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac…☆34Updated last year
- ☆11Updated last year
- ☆43Updated last year
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆30Updated 5 months ago
- ☆18Updated 8 months ago
- ☆15Updated last year
- Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"☆32Updated 10 months ago
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor☆18Updated 2 years ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Updated last year
- [ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"☆32Updated 8 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986☆48Updated last week
- This repository collects papers related to Speech Tokenizer.☆17Updated last year