swimmiing/ACL-SSL

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/swimmiing/ACL-SSL)

swimmiing / ACL-SSL

Repository of the IJCV'26 & WACV'24 paper

☆34

Alternatives and similar repositories for ACL-SSL

Users that are interested in ACL-SSL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

stoneMo / EZ-VSL
View on GitHub
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆41Oct 2, 2022Updated 3 years ago
ardasnck / learning_to_localize_sound_source
View on GitHub
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆102Dec 4, 2024Updated last year
DTaoo / DMC
View on GitHub
Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019)
☆15May 27, 2020Updated 6 years ago
nttrd-mdlab / wearable-seld-dataset
View on GitHub
☆10Feb 18, 2022Updated 4 years ago
ruohaoguo / avis
View on GitHub
[CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation".
☆49Jun 5, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
hxixixh / mix-and-localize
View on GitHub
☆23Mar 20, 2024Updated 2 years ago
OpenNLPLab / FNAC_AVL
View on GitHub
[CVPR 2023] Official implementation of our paper - Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learnin…
☆28Apr 10, 2023Updated 3 years ago
stoneMo / OneAVM
View on GitHub
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12Jun 1, 2023Updated 3 years ago
HS-YN / PanoAVQA
View on GitHub
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)
☆16Oct 12, 2021Updated 4 years ago
kaist-ami / Sound2Scene
View on GitHub
☆42Apr 14, 2025Updated last year
kaistmm / SSLalignment
View on GitHub
☆37May 28, 2025Updated last year
ruohaoguo / ovavss
View on GitHub
Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].
☆37Nov 2, 2024Updated last year
GeWu-Lab / Generalizable-Audio-Visual-Segmentation
View on GitHub
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
☆28Mar 14, 2026Updated 4 months ago
GLJS / AudioToolAgent
View on GitHub
GitHub repository for AudioToolAgent
☆20Feb 13, 2026Updated 5 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
pprablanc / ppsrt
View on GitHub
A python algorithm to change the pitch of the voice in real time
☆13Dec 13, 2020Updated 5 years ago
mclemcrew / MixAssist
View on GitHub
This repository contains the official "LLM-as-a-Judge" evaluation scripts for the MixAssist project, as detailed in our paper, "MixAssist…
☆16Dec 29, 2025Updated 6 months ago
stoneMo / AVGN
View on GitHub
Official implementation for AVGN
☆41Mar 24, 2023Updated 3 years ago
roudimit / Omni-R1
View on GitHub
[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
☆47Nov 21, 2025Updated 7 months ago
yuhanghe01 / Sound3DVDet
View on GitHub
Code for WACV24 work for multiview acoustic-visual detection
☆13Mar 22, 2024Updated 2 years ago
facebookresearch / FlowDec
View on GitHub
An neural full-band audio codec for general audio sampled at 48 kHz with 7.5 kps or 4.5 kbps.
☆212Jun 22, 2026Updated 3 weeks ago
pengzhendong / wavesurfer
View on GitHub
For audio visualization and playback in Jupyter notebooks.
☆18Nov 25, 2025Updated 7 months ago
AI-Hypercomputer / ray-tpu
View on GitHub
☆15May 11, 2025Updated last year
google / df-conformer
View on GitHub
Audio samples accompanying publications related to DF-Conformer, a speech enhancement model.
☆36Jun 23, 2026Updated 3 weeks ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
BASHLab / OWL
View on GitHub
☆15May 25, 2026Updated last month
BingYang-20 / DP-RTF-Learning
View on GitHub
A python implementation of “Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization” [TASLP 2021]
☆28Feb 11, 2023Updated 3 years ago
DTaoo / Simplified_DMC
View on GitHub
A simplified version for DMC (Deep Multimodal Clustering for Unsupervised Audiovisual Learning)
☆19May 27, 2020Updated 6 years ago
Mddct / cosyvoice2-flow-optimized
View on GitHub
faster inference
☆27Jan 20, 2025Updated last year
BolinLai / CSTS
View on GitHub
[ECCV2024] The official implementation of "Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation".
☆16Feb 24, 2025Updated last year
stoneMo / CIGN
View on GitHub
Official implementation for CIGN
☆17Sep 11, 2023Updated 2 years ago
guyyariv / AudioToken
View on GitHub
[InterSpeech 2023] The official PyTorch implementation of: "AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Imag…
☆89May 18, 2026Updated 2 months ago
Bartelds / ctc-dro
View on GitHub
Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.
☆17May 16, 2025Updated last year
stoneMo / SLAVC
View on GitHub
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
☆21Dec 6, 2022Updated 3 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
donghoney0416 / DeepASA
View on GitHub
Official page of "DeepASA: An Object-Oriented Multi-Purpose Network for Auditory Scene Analysis"
☆26Apr 15, 2026Updated 3 months ago
Mddct / transformer-vocos
View on GitHub
☆35Sep 6, 2025Updated 10 months ago
qiuqiangkong / audioflow
View on GitHub
☆128Updated this week
Yuer867 / EMO_Harmonizer
View on GitHub
This is the official repository of Emotion-Driven Melody Harmonization via Melodic Variation and Functional Representation.
☆12Sep 25, 2024Updated last year
marl / SpatialScaper
View on GitHub
☆75Aug 7, 2025Updated 11 months ago
cyh-0 / CAVP
View on GitHub
Official code for "A Closer Look at Audio-Visual Segmentation"
☆97Oct 31, 2025Updated 8 months ago
JiuFengSC / ElasticAST
View on GitHub
Official code of ElasticAST (Interspeech 2024 paper)
☆34Jul 30, 2024Updated last year