jnwnlee/video-foley

Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 2025.

☆19

Alternatives and similar repositories for video-foley

Users who are interested in video-foley are comparing it to the repositories listed below.

ilpoviertola / V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral)
☆35 · Feb 11, 2026 · Updated 5 months ago
ChanganVR / action2sound
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
☆26 · Oct 1, 2024 · Updated last year
XYPB / CondFoleyGen
Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".
☆93 · Dec 8, 2023 · Updated 2 years ago
ispamm / FolAI
Stable-V2A: Synthesis of Synchronized Sound Effect with Temporal and Semantic Controls
☆18 · May 27, 2025 · Updated last year
Apple-jun / FilmComposer
Music production for silent film clips.
☆34 · Apr 30, 2025 · Updated last year
gwh22 / LAFMA
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)
☆44 · Jun 13, 2024 · Updated 2 years ago
zeyuxie29 / PicoAudio
☆45 · Jan 13, 2025 · Updated last year
RBenita / DIFFAR
Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
☆32 · Mar 8, 2024 · Updated 2 years ago
ldzhangyx / MusicMagus
The official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models".
☆49 · Sep 11, 2024 · Updated last year
PeiwenSun2000 / Both-Ears-Wide-Open
The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
☆65 · Jul 2, 2025 · Updated last year
ZeyueT / VidMuse
[CVPR 2025] Repository of VidMuse
☆140 · Jun 7, 2025 · Updated last year
juhayna-zh / BSRNN-speech-preprocess
A solution for denoising and separating two-speaker mixed noisy speech, using a BSRNN-inspired network.
☆15 · Aug 22, 2023 · Updated 2 years ago
james-oldfield / MxD
[NeurIPS'25] Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders
☆16 · May 28, 2025 · Updated last year
yqcai888 / easy_dcase_task1
This repository provides an easy way to train your models on the datasets of DCASE task 1.
☆20 · May 28, 2025 · Updated last year
MatchLab-Imperial / POMA-3D
POMA-3D: The Point Map Way to 3D Scene Understanding.
☆16 · Nov 9, 2025 · Updated 8 months ago
egruttadauria98 / SSpaVAlDo
☆37 · Jan 6, 2026 · Updated 6 months ago
SonyResearch / SVG_baseline
Source code for reproducing the results reported in our paper: https://arxiv.org/abs/2409.17550
☆14 · Nov 15, 2024 · Updated last year
xxayt / MGSV
[ICCV 2025] This repo is the official implementation of "Music Grounding by Short Video"
☆27 · Sep 9, 2025 · Updated 10 months ago
wilkinghoff / DCASE2023_task2
Submission for task 2 "First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring" of the DCASE challenge 2023 (h…
☆18 · May 22, 2023 · Updated 3 years ago
roymiles / ITRD
[BMVC 2022] Information Theoretic Representation Distillation
☆19 · Oct 6, 2023 · Updated 2 years ago
Ego4DSounds / Ego4DSounds
Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence
☆21 · Jun 14, 2024 · Updated 2 years ago
l3das / L3DAS23
Official repository supporting the L3DAS23 IEEE ICASSP Grand Challenge
☆16 · Feb 10, 2023 · Updated 3 years ago
voidful / asrp
ASR text preprocessing utility
☆21 · Aug 5, 2024 · Updated last year
wjc2830 / MelQCD-main
☆32 · Mar 14, 2025 · Updated last year
RoySheffer / im2wav
Official implementation of the pipeline presented in "I Hear Your True Colors: Image Guided Audio Generation"
☆125 · Jan 18, 2023 · Updated 3 years ago
shivammehta25 / BetterFastSpeech2
Just another FastSpeech 2 but cleaner code :)
☆29 · Jun 28, 2024 · Updated 2 years ago
roymiles / Simple-Recipe-Distillation
[AAAI 2024] Understanding the Role of the Projector in Knowledge Distillation
☆20 · Feb 13, 2024 · Updated 2 years ago
Evanwu1125 / LiteCoT
☆17 · Jun 10, 2025 · Updated last year
luosiallen / Diff-Foley
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
☆206 · May 29, 2024 · Updated 2 years ago
cuhealthybrains / MT-LLM
The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions"
☆51 · Apr 7, 2025 · Updated last year
FreedomIntelligence / MTalk-Bench
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
☆20 · Nov 19, 2025 · Updated 8 months ago
poloclub / complicit-splat
3D Gaussian Splat Easily Attacked to Cause Harm
☆13 · Aug 5, 2025 · Updated 11 months ago
kyegomez / Mirasol
PyTorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"
☆26 · Jan 27, 2025 · Updated last year
Bezdarnost / awesome-super-resolution
A collection, with descriptions, of super-resolution-related papers, repositories, datasets, loss functions, etc.
☆11 · Dec 12, 2023 · Updated 2 years ago
roymiles / VeLoRA
[NeurIPS 2024] VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
☆22 · Oct 15, 2024 · Updated last year
xlvector / abcmidi
abc2midi is a program that converts an abc music notation file to a MIDI file.
☆47 · Jun 1, 2016 · Updated 10 years ago
ssahoo11742 / Scopul
A Python package to extract information from MIDI files
☆14 · Aug 30, 2023 · Updated 2 years ago
pengyizhou / FD-Bench
☆25 · Aug 14, 2025 · Updated 11 months ago
codename0og / RVC_Onnx_Infer
RVC Onnx Infer: upgraded and simplified-ish
☆25 · May 9, 2024 · Updated 2 years ago