okankop/ASDNet

Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset

☆73

Alternatives and similar repositories for ASDNet

Users interested in ASDNet are comparing it to the repositories listed below.

fuankarion / active-speakers-context
Code for the Active Speakers in Context Paper (CVPR2020)
☆58 · May 19, 2021 · Updated 5 years ago
Tiago-Roxo / WASD
☆20 · Mar 20, 2026 · Updated 4 months ago
TaoRuijie / TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
☆488 · Oct 23, 2023 · Updated 2 years ago
SRA2 / SPELL
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)
☆67 · Oct 29, 2023 · Updated 2 years ago
tuanchien / asd
Active Speaker Detection
☆19 · Jun 19, 2020 · Updated 6 years ago
Martlgap / x-face-verification
Repo for our Paper: Explainable Model-Agnostic Similarity and Confidence in Face Verification
☆18 · Sep 28, 2023 · Updated 2 years ago
Junhua-Liao / Light-ASD
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
☆181 · Mar 23, 2025 · Updated last year
DanielMengLiu / DeepLip
Deep-learning-based audio-visual lip biometrics
☆15 · May 9, 2023 · Updated 3 years ago
zcxu-eric / Ego4d_TalkNet_ASD
☆21 · Feb 15, 2022 · Updated 4 years ago
tteepe / EarlyBird
Official Code for "EarlyBird: Early-Fusion for Multi-View Tracking in the Bird's Eye View"
☆60 · Mar 12, 2024 · Updated 2 years ago
Martlgap / face-alignment-mtcnn
A lightweight Python implementation of face alignment with MTCNN landmarks using the tensorflow-lite runtime
☆30 · Mar 31, 2025 · Updated last year
fubel / synthehicle
[WACVW 2023] A massive synthetic dataset for 3D multi-target multi-camera tracking and segmentation.
☆54 · Jul 12, 2023 · Updated 3 years ago
Martlgap / xqlfw
Repo for our Paper: Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments
☆19 · Nov 25, 2022 · Updated 3 years ago
mertkayhan / SSL-2D-Pose
Code for the paper "Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments"
☆12 · Dec 14, 2019 · Updated 6 years ago
okankop / Dissected-3D-CNNs
☆15 · Oct 8, 2020 · Updated 5 years ago
SJTUwxz / LoCoNet_ASD
Code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection
☆57 · May 1, 2023 · Updated 3 years ago
tteepe / CenterNet-pytorch-lightning
Refactored implementation of CenterNet (Objects as Points - Zhou, Xingyi et al.) shipping with PyTorch Lightning modules
☆59 · Mar 10, 2023 · Updated 3 years ago
aispeech-lab / advr-avss
PyTorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.
☆18 · Jul 11, 2022 · Updated 4 years ago
tteepe / GaitGraph2
Official code for "Towards a Deeper Understanding of Skeleton-based Gait Recognition" (CVPRW'22)
☆46 · Apr 19, 2022 · Updated 4 years ago
zcxu-eric / AVA-AVD
☆51 · Nov 24, 2022 · Updated 3 years ago
clovaai / lookwhostalking
Look Who’s Talking: Active Speaker Detection in the Wild
☆76 · Aug 24, 2023 · Updated 2 years ago
rash1993 / movie-asd
Repo for active speaker detection in media videos.
☆31 · Nov 19, 2023 · Updated 2 years ago
ciodar / UniversalAttribution
[ECCVW/TWYN 2024 - Best Workshop Paper] Are CLIP features all you need for Universal Synthetic Image Origin Attribution?
☆14 · Mar 27, 2026 · Updated 3 months ago
uark-cviu / Right2Talk
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20 · Aug 2, 2021 · Updated 4 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆114 · Nov 16, 2020 · Updated 5 years ago
LMSAudio / Complex_PF
RES via complex-valued DNN
☆26 · Sep 3, 2021 · Updated 4 years ago
zexupan / MuSE
☆42 · Nov 22, 2024 · Updated last year
Junhua-Liao / LR-ASD
The repository for Springer IJCV 2025 (LR-ASD: Lightweight and Robust Network for Active Speaker Detection)
☆131 · Mar 23, 2025 · Updated last year
mayunxi / mpp_rtsp_play_QT
☆11 · Mar 30, 2020 · Updated 6 years ago
TaoRuijie / SEANet
Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)
☆32 · Feb 28, 2025 · Updated last year
vskadandale / vocalist
Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
☆73 · Apr 7, 2024 · Updated 2 years ago
danmic / av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222 · Apr 16, 2023 · Updated 3 years ago
fubel / stmodeling
Code for the paper "Comparative Analysis of CNN-based Spatiotemporal Reasoning in Videos"
☆14 · May 3, 2024 · Updated 2 years ago
aispeech-lab / TinyWASE
PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…
☆11 · Jun 28, 2021 · Updated 5 years ago
dtransposed / MLP-Mixer
PyTorch implementation of MLP-Mixer architecture.
☆12 · May 24, 2021 · Updated 5 years ago
tteepe / GaitGraph
Official repository for "GaitGraph: Graph Convolutional Network for Skeleton-Based Gait Recognition" (ICIP'21)
☆112 · Apr 19, 2022 · Updated 4 years ago
Martlgap / octuplet-loss
Repo for our Paper: Octuplet Loss: Make Your Face Recognition Model Robust to Image Resolution
☆53 · Nov 27, 2023 · Updated 2 years ago
ms-dot-k / LRW_ID
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10 · Oct 12, 2023 · Updated 2 years ago
iariav / End-to-End-VAD
Audio-Visual Voice Activity Detection using Deep Learning
☆52 · Apr 7, 2019 · Updated 7 years ago