lzhangbj/ASVA

[ECCV 2024 Oral] Audio-Synchronized Visual Animation

☆60

Alternatives and similar repositories for ASVA

Users who are interested in ASVA are comparing it to the libraries listed below.

mzsun01 / MM-LDM
☆11 · Apr 12, 2024 · Updated 2 years ago
OpenNLPLab / TAVGBench
Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation
☆15 · Apr 7, 2025 · Updated last year
SonyResearch / SVG_baseline
to release the source code for reproducing the results reported in our paper: https://arxiv.org/abs/2409.17550
☆14 · Nov 15, 2024 · Updated last year
v-iashin / Synchformer
Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)
☆130 · Sep 15, 2025 · Updated 10 months ago
ilpoviertola / V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral)
☆35 · Feb 11, 2026 · Updated 5 months ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
yuhanghe01 / RiTTA
Event Relation in Text-to-Audio (TTA) Generation
☆21 · Feb 26, 2025 · Updated last year
luosiallen / Diff-Foley
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
☆206 · May 29, 2024 · Updated 2 years ago
zhiwei-zzz / MoScale
[CVPR 2026] Next-Scale Autoregressive Models for Text-to-Motion Generation
☆16 · Jun 14, 2026 · Updated last month
liangsusan-git / AV-NeRF
[NeurIPS 2023] AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis
☆36 · Feb 15, 2024 · Updated 2 years ago
jaeyeonkim99 / visage
Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)
☆47 · Sep 10, 2025 · Updated 10 months ago
guyyariv / TempoTokens
[AAAI 2024] The official PyTorch implementation of "Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation"
☆131 · May 18, 2026 · Updated 2 months ago
naver-ai / rewas
Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video"
☆44 · Dec 13, 2024 · Updated last year
facebookresearch / real-acoustic-fields
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
☆64 · Aug 29, 2024 · Updated last year
ariesssxu / vta-ldm
☆61 · Jun 15, 2025 · Updated last year
pedro-morgado / AVSpatialAlignment
☆31 · Jun 14, 2022 · Updated 4 years ago
pedro-morgado / spatialaudiogen
Spatial Audio Generation
☆117 · Mar 24, 2023 · Updated 3 years ago
open-mmlab / FoleyCrafter
[IJCV 2026] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley artist that adds vivid, synchronized sound effects to your silent videos 😝
☆658 · Jun 15, 2026 · Updated last month
HilaManor / AudioEditingCode
☆195 · Nov 19, 2025 · Updated 8 months ago
ChanganVR / action2sound
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
☆26 · Oct 1, 2024 · Updated last year
maswang32 / hearinganythinganywhere
Hearing Anything Anywhere Code Release
☆52 · Nov 11, 2025 · Updated 8 months ago
YoonjinXD / T-FOLEY
Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac…
☆34 · May 25, 2024 · Updated 2 years ago
facebookresearch / soundvista
soundvista
☆16 · Dec 31, 2025 · Updated 6 months ago
pprablanc / ppsrt
A Python algorithm to change the pitch of the voice in real time
☆13 · Dec 13, 2020 · Updated 5 years ago
happylittlecat2333 / Auffusion
Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati…
☆194 · Mar 25, 2024 · Updated 2 years ago
yzxing87 / Seeing-and-Hearing
[CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners
☆155 · Jul 6, 2024 · Updated 2 years ago
XYPB / CondFoleyGen
Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".
☆93 · Dec 8, 2023 · Updated 2 years ago
snap-research / GenAU
☆53 · Mar 24, 2026 · Updated 4 months ago
kyegomez / Mirasol
PyTorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"
☆26 · Jan 27, 2025 · Updated last year
justivanr / art2mus_
Art2Mus is a system that generates music based on digitized artworks and text by using the AudioLDM2 architecture with an added projectio…
☆20 · Oct 20, 2025 · Updated 9 months ago
jianzongwu / Does-Hearing-Help-Seeing
☆19 · Dec 3, 2025 · Updated 7 months ago
ruohaoguo / avis
[CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation".
☆52 · Jun 5, 2025 · Updated last year
soham97 / mellow
small audio language model for reasoning
☆88 · Dec 4, 2025 · Updated 7 months ago
PeiwenSun2000 / Both-Ears-Wide-Open
The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
☆65 · Jul 2, 2025 · Updated last year
qinghuannn / ChainHOI
☆15 · May 1, 2025 · Updated last year
bytedance / Make-An-Audio-2
a text-conditional diffusion probabilistic model capable of generating high fidelity audio.
☆197 · May 29, 2024 · Updated 2 years ago
LiBingyu01 / FGA-seg
Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation
☆16 · Mar 28, 2026 · Updated 4 months ago
Jiaxin-Ye / Emo-DNA
[ACM MM 2023] Official PyTorch implementation of "Emo-DNA: Emotion Decoupling and Alignment Learning for Cross-Corpus Speech Emotion Reco…
☆12 · Aug 4, 2023 · Updated 2 years ago
ta012 / SSLAM
[ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
☆79 · Oct 8, 2025 · Updated 9 months ago