PeihaoChen/regnet

Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned Sound (VAS) dataset.

☆53

Alternatives and similar repositories for regnet

Users interested in regnet are comparing it to the libraries listed below.

v-iashin / SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
☆372 · Jul 12, 2024 · Updated 2 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
ilpoviertola / V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral)
☆35 · Feb 11, 2026 · Updated 5 months ago
shlizee / Audeo
☆31 · Feb 4, 2021 · Updated 5 years ago
descriptinc / lyrebird-wav2clip
Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP
☆359 · Feb 15, 2022 · Updated 4 years ago
RoySheffer / im2wav
Official implementation of the pipeline presented in I hear your true colors: Image Guided Audio Generation
☆125 · Jan 18, 2023 · Updated 3 years ago
MorenoLaQuatra / audiocaps-download
This package aims at simplifying the download of the AudioCaps dataset.
☆35 · Dec 1, 2023 · Updated 2 years ago
yagyapandeya / Supervised-Music-Video-Emotion-Classification
The extended and verified music video emotion analysis dataset for data driven algorithm.
☆18 · Aug 9, 2021 · Updated 4 years ago
pedro-morgado / AVSpatialAlignment
☆31 · Jun 14, 2022 · Updated 4 years ago
chuangg / Foley-Music
PyTorch implementation of ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos"
☆39 · Dec 15, 2020 · Updated 5 years ago
lxa9867 / QSD
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12 · Feb 27, 2024 · Updated 2 years ago
yytzsy / SMCG
Code for the paper "Controllable Video Captioning with an Exemplar Sentence"
☆12 · Apr 14, 2021 · Updated 5 years ago
ExplainableML / AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and …
☆42 · Nov 29, 2022 · Updated 3 years ago
krantiparida / awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
☆775 · Jan 30, 2024 · Updated 2 years ago
qiuqiangkong / sampleRNN_acoustic_scene_generation
☆14 · Apr 18, 2019 · Updated 7 years ago
audio-captioning / caption-evaluation-tools
Tools for the evaluation of audio captioning.
☆19 · May 23, 2020 · Updated 6 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆61 · Jan 19, 2022 · Updated 4 years ago
FloretCat / CMRAN
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020
☆33 · Nov 6, 2020 · Updated 5 years ago
jamezilla / ofxQTVideoSaver
A video saver using the QuickTime API for openframeworks.
☆16 · Sep 11, 2012 · Updated 13 years ago
WikiChao / VisAH
[CVPR 2025] Pytorch implementation of the paper "Learning to Highlight Audio by Watching Movies"
☆15 · Oct 1, 2025 · Updated 9 months ago
OpenNLPLab / FNAC_AVL
[CVPR 2023] Official implementation of our paper - Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learnin…
☆29 · Apr 10, 2023 · Updated 3 years ago
SheldonTsui / SepStereo_ECCV2020
Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)
☆72 · Oct 20, 2020 · Updated 5 years ago
Huage001 / Artistic-Video-Partial-Conv-Depth-Loss
This is pytorch implementation of paper Stable Video Style Transfer Based on Partial Convolution with Depth-Aware Supervision.
☆13 · Aug 5, 2020 · Updated 5 years ago
ardasnck / learning_to_localize_sound_source
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆102 · Dec 4, 2024 · Updated last year
WikiChao / DAVIS
[🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound …
☆33 · Mar 30, 2026 · Updated 3 months ago
naver-ai / rewas
Official PyTorch implementation of ReWaS (AAAI'25) "Read, Watch and Scream! Sound Generation from Text and Video"
☆44 · Dec 13, 2024 · Updated last year
spkgyk / TDFNet
Official code release for "TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion", accepted ICIST 2023
☆14 · Mar 17, 2024 · Updated 2 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Updated this week
AI-secure / Characterizing-Audio-Adversarial-Examples-using-Temporal-Dependency
ICLR 2019 Paper, "Characterizing Audio Adversarial Examples using Temporal Dependency".
☆11 · Apr 3, 2019 · Updated 7 years ago
GeWu-Lab / awesome-audiovisual-learning
A curated list of audio-visual learning methods and datasets.
☆288 · Dec 3, 2024 · Updated last year
dendisuhubdy / ADMMGLA
Griffin-Lim Like Phase Recovery via Alternating Direction Method of Multipliers (Yoshiki Masuyama et al., 2018)
☆13 · Dec 17, 2018 · Updated 7 years ago
speedyseal / audiosetdl
Scripts for download AudioSet
☆89 · Nov 7, 2017 · Updated 8 years ago
HS-YN / PanoAVQA
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)
☆16 · Oct 12, 2021 · Updated 4 years ago
GeWu-Lab / MMCosine_ICASSP23
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
soham97 / mellow
small audio language model for reasoning
☆88 · Dec 4, 2025 · Updated 7 months ago
luosiallen / Diff-Foley
Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
☆205 · May 29, 2024 · Updated 2 years ago
khdlr / SoundingEarth
Self-supervised Audiovisual Representation Learning for Remote Sensing Data
☆34 · May 22, 2023 · Updated 3 years ago
BriansIDP / AudioVisualLLM
☆19 · May 19, 2024 · Updated 2 years ago
akoepke / audio-retrieval-benchmark
Code for "Audio Retrieval with Natural Language Queries: A Benchmark Study", Transactions on Multimedia 2022
☆54 · Jul 16, 2025 · Updated last year