plnguyen2908/LASER_ASD

[WACV 2026 Oral] LASER: Lip Landmark Assisted Speaker Detection for Robustness official implementation

☆30

Alternatives and similar repositories for LASER_ASD

Users who are interested in LASER_ASD are comparing it to the repositories listed below.

kaistmm / TalkNCE
Official implementation of TalkNCE (ICASSP 2024).
☆18 · Apr 30, 2025 · Updated last year
Junhua-Liao / LR-ASD
The repository for Springer IJCV 2025 (LR-ASD: Lightweight and Robust Network for Active Speaker Detection)
☆132 · Mar 23, 2025 · Updated last year
kamata1729 / SDXL_controlnet_inpait_img2img_pipelines
☆13 · Aug 24, 2023 · Updated 2 years ago
WeChatCV / D-ORCA
D-ORCA: Dialogue-Centric Optimization for Robust Audio-Visual Captioning
☆15 · Feb 11, 2026 · Updated 5 months ago
malteprinzler / match
Official Code for the CVPR 2026 Paper "MATCH: Feed-forward Gaussian Registration for Head Avatar Creation and Editing"
☆16 · Jun 6, 2026 · Updated last month
Jiang-Yidi / TS-TalkNet
INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues
☆61 · May 29, 2023 · Updated 3 years ago
wangyz1999 / syncnet-speaker-diarization
Identifying "who speak when" using visual speech input and pretrained lip-sync expert
☆18 · Jul 1, 2023 · Updated 3 years ago
purbeshmitra / MOTIF
MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs
☆17 · Jul 6, 2025 · Updated last year
kaistmm / AVCD
[NeurIPS 2025] AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
☆27 · Nov 3, 2025 · Updated 8 months ago
griko / vanpy
☆19 · Jul 23, 2025 · Updated last year
laitifranz / AR-SMPLX
Animation of an SMPLX character in an augmented reality application
☆19 · Aug 22, 2024 · Updated last year
Tsinghua-MARS-Lab / NeuralDubber
The project page repo for Neural Dubber.
☆30 · Sep 20, 2023 · Updated 2 years ago
Guohanzhong / OSA-LCM
☆25 · Dec 19, 2024 · Updated last year
adobe-research / llava-score
☆11 · Oct 2, 2024 · Updated last year
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
TaoRuijie / SEANet
Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)
☆32 · Feb 28, 2025 · Updated last year
X-zhangyang / SelfPIFu--PIFu-for-the-Real-World
Dressed Human Reconstruction from Single-view Real World Image
☆25 · Mar 25, 2024 · Updated 2 years ago
Junhua-Liao / Light-ASD
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
☆181 · Mar 23, 2025 · Updated last year
ms-dot-k / Visual-Context-Attentional-GAN
PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021)
☆25 · Mar 9, 2024 · Updated 2 years ago
njyeung / reels
Instagram Reels in your terminal
☆25 · Updated this week
PalabraAI / redimnet2
This repository contains the official implementation and pretrained weights for the paper "ReDimNet2: Scaling Speaker Verification via Ti…
☆65 · Jul 9, 2026 · Updated 2 weeks ago
LouisFinner / HiM2SAM
This is the official implementation of work HiM2SAM in PRCV25.
☆29 · Aug 30, 2025 · Updated 10 months ago
JuanFMontesinos / Acappella-YNet
Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21
☆18 · May 14, 2022 · Updated 4 years ago
amazon-science / iwslt-autodub-task
☆21 · Mar 4, 2024 · Updated 2 years ago
mpc001 / Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
☆478 · Aug 17, 2023 · Updated 2 years ago
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22 · May 26, 2025 · Updated last year
madhavlab / 2022_syncnet
SyncNet for Time Synchronization
☆30 · Mar 13, 2023 · Updated 3 years ago
EGO4D / audio-visual
☆69 · Sep 13, 2022 · Updated 3 years ago
BeautyyuYanli / GPT-SoVITS-Infer
The inference code of RVC-Boss/GPT-SoVITS that can be developer-friendly.
☆16 · Sep 29, 2024 · Updated last year
fclearner / Personal-vad-2.0
Implementation of "Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition"
☆16 · Jun 9, 2026 · Updated last month
hugoliv / projectvertices
☆11 · Dec 19, 2020 · Updated 5 years ago
x-d-wang / Soft-Person-Reidentification-Network-Pruning-via-Blockwise-Adjacent-Filter-Decaying
☆16 · Jan 13, 2024 · Updated 2 years ago
SwinTransformer / Simple-21K-Detection
☆13 · Jul 20, 2022 · Updated 4 years ago
ashawkey / fbxloader
FBX file loader for python (only supports geometry currently)
☆17 · Aug 5, 2024 · Updated last year
nowickam / facial-animation
Audio-driven facial animation generator with BiLSTM used for transcribing the speech and web interface displaying the avatar and the anim…
☆36 · Jul 14, 2022 · Updated 4 years ago
francescotonini / al-gtd
Official repo of the paper “AL-GTD: Deep Active Learning for Gaze Target Detection” (ACMMM2024)
☆12 · Updated this week
Sindhu-Hegde / gestsync
Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023
☆48 · Sep 1, 2024 · Updated last year
griffbr / TFOD
Task-Focused Few-Shot Object Detection Benchmark
☆14 · Jun 24, 2025 · Updated last year
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
☆11 · Nov 5, 2021 · Updated 4 years ago