rxtan2/AVSeT

☆17

Alternatives and similar repositories for AVSeT

Users that are interested in AVSeT are comparing it to the libraries listed below.

stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
arijitray1993 / CARLA_tutorial
☆21 · Aug 16, 2021 · Updated 4 years ago
changzheng123 / L-CoDer
Implementation for "L-CoDer: Language-based Colorization with Color-object Decoupling Transformer"
☆13 · Jan 20, 2024 · Updated 2 years ago
Tinglok / avstyle
Codebase for the Paper: Learning Visual Styles from Audio-Visual Associations (ECCV 2022, in PyTorch)
☆15 · Jan 26, 2023 · Updated 3 years ago
stoneMo / EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆42 · Oct 2, 2022 · Updated 3 years ago
HS-YN / PanoAVQA
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)
☆16 · Oct 12, 2021 · Updated 4 years ago
YapengTian / AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
☆210 · Apr 3, 2021 · Updated 5 years ago
ExplainableML / AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and …
☆42 · Nov 29, 2022 · Updated 3 years ago
YapengTian / CCOL-CVPR21
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
☆26 · Nov 24, 2021 · Updated 4 years ago
neuralchen / Bivolution
Accepted by AAAI2022
☆21 · Apr 10, 2022 · Updated 4 years ago
AmeenAli / VideoMatch
☆14 · Jan 5, 2022 · Updated 4 years ago
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25 · May 14, 2026 · Updated 2 months ago
MadryLab / D3M
Debiasing Through Data Attribution
☆13 · May 23, 2024 · Updated 2 years ago
WikiChao / VisAH
[CVPR 2025] Pytorch implementation of the paper "Learning to Highlight Audio by Watching Movies"
☆15 · Oct 1, 2025 · Updated 9 months ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆137 · Aug 12, 2022 · Updated 3 years ago
StevenHickson / CreateNormals
☆11 · Nov 22, 2019 · Updated 6 years ago
yzyouzhang / Audio_Research_in_US
Audio Research in US. US-based professors who work on audio (music, speech, acoustics). For students who would like to apply for RA, PhD,…
☆27 · Feb 27, 2026 · Updated 4 months ago
rxtan2 / DIDAN
☆20 · Jul 30, 2024 · Updated last year
jhuang448 / MultilingualALT
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"
☆15 · Jun 28, 2024 · Updated 2 years ago
sainathadapa / dcase2019-task5-urban-sound-tagging
1st place solution to the DCASE 2019 - Task 5 - Urban Sound Tagging
☆30 · Mar 19, 2021 · Updated 5 years ago
xmos / lib_agc
Automatic gain control library
☆15 · Jul 13, 2024 · Updated 2 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Updated this week
carlthome / pmqd
Perceived Music Quality Dataset
☆12 · Jul 1, 2024 · Updated 2 years ago
yunzqq / DeepMMSE
DeepMMSE: A Deep Learning Approach to MMSE-based Noise Power Spectral Density Estimation
☆12 · Jun 4, 2020 · Updated 6 years ago
ttslr / MonTTS
☆16 · Dec 23, 2021 · Updated 4 years ago
haoxiangsnr / llm-tse
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
☆43 · Oct 13, 2023 · Updated 2 years ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆35 · Jun 20, 2023 · Updated 3 years ago
jinxiang-liu / anno-free-AVS
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
☆38 · Oct 11, 2024 · Updated last year
hyunwoongko / stop-sequencer
Implementation of stop sequencer for Huggingface Transformers
☆16 · Jun 6, 2023 · Updated 3 years ago
WikiChao / DAVIS
[🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound …
☆33 · Mar 30, 2026 · Updated 3 months ago
Devin-Taylor / MultiAug
Multi-modal data augmentation for machine learning
☆16 · Jun 4, 2019 · Updated 7 years ago
chzhang18 / StereoEchoes
Stereo Depth Estimation with Echoes at ECCV 2022
☆10 · Sep 20, 2022 · Updated 3 years ago
cyh-0 / BoMD
Official code for "BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification"
☆26 · Apr 11, 2024 · Updated 2 years ago
iskwak / DetetctingActionStarts
Detecting the starting frame of actions in video.
☆10 · Feb 12, 2020 · Updated 6 years ago
pedro-morgado / AVSpatialAlignment
☆31 · Jun 14, 2022 · Updated 4 years ago
zwx8981 / ADTH-QA
☆19 · Sep 5, 2024 · Updated last year
kuai-lab / soundini-official
We are committing code.
☆44 · May 18, 2023 · Updated 3 years ago
Pliploop / GDRetriever
Official implementation of the paper - GD-Retriever: Controllable generative text-music retrieval with diffusion models (Accepted at ISMI…
☆19 · Sep 25, 2025 · Updated 9 months ago
apple-yinhan / TQ-SED
☆24 · Mar 19, 2025 · Updated last year