Junhua-Liao/LR-ASD

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Junhua-Liao/LR-ASD)

Junhua-Liao / LR-ASD

The repository for Springer IJCV 2025 (LR-ASD: Lightweight and Robust Network for Active Speaker Detection)

☆132

Alternatives and similar repositories for LR-ASD

Users that are interested in LR-ASD are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

plnguyen2908 / LASER_ASD
View on GitHub
[WACV 2026 Oral] LASER: Lip Landmark Assisted Speaker Detection for Robustness official implemntation
☆30Feb 26, 2026Updated 5 months ago
Junhua-Liao / Light-ASD
View on GitHub
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
☆181Mar 23, 2025Updated last year
plnguyen2908 / UniTalk-ASD-code
View on GitHub
[Interspeech 2026] Revisiting Active Speaker Detection: An In-the-Wild Benchmark for Generalization and Robustness
☆22Jun 25, 2026Updated last month
SJTUwxz / LoCoNet_ASD
View on GitHub
code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection
☆57May 1, 2023Updated 3 years ago
ZXHY-82 / w2v-BERT-2.0_SV
View on GitHub
☆53Mar 28, 2026Updated 3 months ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
Jiang-Yidi / TS-TalkNet
View on GitHub
INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues
☆61May 29, 2023Updated 3 years ago
SSTC-Challenge / SSTC2024_baseline_system
View on GitHub
☆12Jun 14, 2024Updated 2 years ago
Tiago-Roxo / WASD
View on GitHub
☆20Updated this week
okankop / ASDNet
View on GitHub
Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset
☆73Jan 18, 2022Updated 4 years ago
TaoRuijie / MFV-KSD
View on GitHub
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)
☆22Jul 25, 2024Updated 2 years ago
TaoRuijie / TalkNet-ASD
View on GitHub
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
☆489Oct 23, 2023Updated 2 years ago
nguyenvulebinh / AVSRCocktail
View on GitHub
Audio-Visual Speech Recognition
☆26Jul 7, 2025Updated last year
X-LANCE / MSDWILD
View on GitHub
[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆66Jan 24, 2024Updated 2 years ago
TaoRuijie / SEANet
View on GitHub
Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)
☆32Feb 28, 2025Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
RobinWitch / DyStream
View on GitHub
☆37Feb 7, 2026Updated 5 months ago
mavceleb / mavceleb_baseline
View on GitHub
☆11Nov 5, 2025Updated 8 months ago
BUTSpeechFIT / SOT-DiCoW
View on GitHub
Multi-talker ASR based on DiCoW with Serialized Output Training
☆20Sep 18, 2025Updated 10 months ago
BASHLab / OWL
View on GitHub
☆15May 25, 2026Updated 2 months ago
rash1993 / movie-asd
View on GitHub
repo for active speaker detection for media videos.
☆31Nov 19, 2023Updated 2 years ago
kaistmm / TalkNCE
View on GitHub
Official implementation of TalkNCE (ICASSP 2024).
☆18Apr 30, 2025Updated last year
zcxu-eric / AVA-AVD
View on GitHub
☆51Nov 24, 2022Updated 3 years ago
xiaoxiaomiao323 / MSA
View on GitHub
☆16Feb 19, 2026Updated 5 months ago
DongKeon / Awesome-Speaker-Diarization
View on GitHub
Some comprehensive papers about speaker diarization
☆368Mar 24, 2026Updated 4 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
ryota-komatsu / speech_resynth
View on GitHub
Speech Resynthesis and Language Modeling
☆27Jun 11, 2025Updated last year
mubingshen / MLC-SLM-Baseline
View on GitHub
The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…
☆51May 14, 2025Updated last year
umbertocappellazzo / Llama-AVSR
View on GitHub
Official Pytorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigat…
☆64Jan 18, 2026Updated 6 months ago
ydk122024 / CCIM
View on GitHub
[CVPR2023] Context De-confounded Emotion Recognition
☆18Jul 23, 2023Updated 3 years ago
Dorniwang / SpeakerVid-5M-Code
View on GitHub
The official SpeakerVid-5M data curation code.
☆82Jul 23, 2025Updated last year
JaesungHuh / av-diarization
View on GitHub
Audio-visual diarization pipeline used for creating VoxConverse dataset
☆22Jun 6, 2025Updated last year
Bose / RAVEN
View on GitHub
☆20Oct 6, 2025Updated 9 months ago
FreedomIntelligence / TalkVid
View on GitHub
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis [CVPR 2026 Findings]
☆193Jun 8, 2026Updated last month
wenet-e2e / wesep
View on GitHub
Target Speaker Extraction Toolkit
☆299Oct 4, 2025Updated 9 months ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
CarlWangChina / SaMoye-SVC
View on GitHub
dog-can-sing-song
☆59Jan 9, 2026Updated 6 months ago
ductuantruong / enskd
View on GitHub
[ICASSP'24] Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
☆16Mar 20, 2024Updated 2 years ago
DanielMengLiu / DeepLip
View on GitHub
deep-learning based audio-visual lip bometrics
☆15May 9, 2023Updated 3 years ago
nttcslab-sp / mamba-diarization
View on GitHub
Official repository for Mamba-based Segmentation Model for Speaker Diarization
☆47May 13, 2025Updated last year
DonkeyShot21 / uis-rnn-sml
View on GitHub
A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)
☆61Apr 15, 2020Updated 6 years ago
BUTSpeechFIT / TS-ASR-Whisper
View on GitHub
☆116Jun 29, 2026Updated 3 weeks ago
kaistmm / seed-pytorch
View on GitHub
[INTERSPEECH 2025] Official code for "SEED: Speaker Embedding Enhancement Diffusion Model"
☆59Nov 3, 2025Updated 8 months ago