SJTUwxz/LoCoNet_ASD

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/SJTUwxz/LoCoNet_ASD)

SJTUwxz / LoCoNet_ASD

code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection

☆57

Alternatives and similar repositories for LoCoNet_ASD

Users that are interested in LoCoNet_ASD are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

kaistmm / TalkNCE
View on GitHub
Official implementation of TalkNCE (ICASSP 2024).
☆18Apr 30, 2025Updated last year
plnguyen2908 / LASER_ASD
View on GitHub
[WACV 2026 Oral] LASER: Lip Landmark Assisted Speaker Detection for Robustness official implemntation
☆30Feb 26, 2026Updated 5 months ago
plnguyen2908 / UniTalk-ASD-code
View on GitHub
[Interspeech 2026] Revisiting Active Speaker Detection: An In-the-Wild Benchmark for Generalization and Robustness
☆22Jun 25, 2026Updated last month
Tiago-Roxo / WASD
View on GitHub
☆20Updated this week
Junhua-Liao / Light-ASD
View on GitHub
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
☆181Mar 23, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
TaoRuijie / TalkNet-ASD
View on GitHub
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
☆489Oct 23, 2023Updated 2 years ago
Junhua-Liao / LR-ASD
View on GitHub
The repository for Springer IJCV 2025 (LR-ASD: Lightweight and Robust Network for Active Speaker Detection)
☆132Mar 23, 2025Updated last year
showlab / AVA-AVD
View on GitHub
☆22Nov 24, 2022Updated 3 years ago
SAGNIKMJR / ego-AV-spatial-correspondence
View on GitHub
[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆14Jun 16, 2024Updated 2 years ago
okankop / ASDNet
View on GitHub
Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset
☆73Jan 18, 2022Updated 4 years ago
ms-dot-k / Visual-Audio-Memory
View on GitHub
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
☆22Apr 11, 2022Updated 4 years ago
Overcautious / ADENet
View on GitHub
Accepted by TMM 2022
☆19Aug 18, 2022Updated 3 years ago
Bose / RAVEN
View on GitHub
☆20Oct 6, 2025Updated 9 months ago
Jiang-Yidi / TS-TalkNet
View on GitHub
INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues
☆61May 29, 2023Updated 3 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
YasserdahouML / VSR_test_set
View on GitHub
WildVSR
☆22Dec 13, 2023Updated 2 years ago
TaoRuijie / SEANet
View on GitHub
Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)
☆32Feb 28, 2025Updated last year
EGO4D / audio-visual
View on GitHub
☆69Sep 13, 2022Updated 3 years ago
choijeongsoo / av2av
View on GitHub
[CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
☆48Sep 6, 2024Updated last year
griko / vanpy
View on GitHub
☆19Jul 23, 2025Updated last year
MarshallT-99 / VALLR
View on GitHub
☆29Oct 1, 2025Updated 9 months ago
ms-dot-k / Multi-head-Visual-Audio-Memory
View on GitHub
PyTorch implementation of "Distinguishing Homophenes using Multi-Head Visual-Audio Memory" (AAAI2022)
☆27Mar 9, 2024Updated 2 years ago
X-LANCE / MSDWILD
View on GitHub
[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆66Jan 24, 2024Updated 2 years ago
JeongHun0716 / vsr-low
View on GitHub
Visual Speech Recognition For Low-Resource Languages with Automatic Labels (ICASSP 2024)
☆17Mar 17, 2025Updated last year
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
Sreyan88 / LipGER
View on GitHub
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19Jul 16, 2024Updated 2 years ago
AV-Reasoner / AV-Reasoner
View on GitHub
☆19Jul 22, 2025Updated last year
BolinLai / CSTS
View on GitHub
[ECCV2024] The official implementation of "Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation".
☆16Feb 24, 2025Updated last year
klauscc / DAM
View on GitHub
Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning
☆15Apr 25, 2024Updated 2 years ago
wangyz1999 / syncnet-speaker-diarization
View on GitHub
Identifying "who speak when" using visual speech input and pretrained lip-sync expert
☆18Jul 1, 2023Updated 3 years ago
huangruizhe / audio
View on GitHub
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10Sep 30, 2024Updated last year
zaocan666 / DyViSE
View on GitHub
Dynamic vision-guided speaker embedding for audio-visual speaker diarization
☆12Jul 5, 2022Updated 4 years ago
kaistmm / voxceleb-disentangler
View on GitHub
[INTERSPEECH 2024] Official pytorch code for the paper "Disentangled Representation Learning for Environment-agnostic Speaker Recognition…
☆18Jul 23, 2024Updated 2 years ago
fuankarion / active-speakers-context
View on GitHub
Code for the Active Speakers in Context Paper (CVPR2020)
☆58May 19, 2021Updated 5 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
ruohaoguo / avis
View on GitHub
[CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation".
☆52Jun 5, 2025Updated last year
joonson / syncnet_python
View on GitHub
Out of time: automated lip sync in the wild
☆894Apr 17, 2026Updated 3 months ago
pengzhendong / streaming-asr
View on GitHub
One command to start a streaming ASR server.
☆12Oct 2, 2024Updated last year
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
View on GitHub
☆11Nov 5, 2021Updated 4 years ago
HolgerBovbjerg / SSL-PVAD
View on GitHub
A repository for code used to produce the results the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…
☆25Nov 25, 2024Updated last year
v-iashin / Synchformer
View on GitHub
Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024)
☆130Sep 15, 2025Updated 10 months ago
EGO4D / ego-exo4d-egopose
View on GitHub
☆18Apr 16, 2024Updated 2 years ago