fuankarion/active-speakers-context

Code for the Active Speakers in Context Paper (CVPR2020)

☆58

Alternatives and similar repositories for active-speakers-context

Users who are interested in active-speakers-context are comparing it to the repositories listed below.

okankop / ASDNet
Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset
☆73 · Jan 18, 2022 · Updated 4 years ago
tuanchien / asd
Active Speaker Detection
☆19 · Jun 19, 2020 · Updated 6 years ago
TaoRuijie / TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
☆489 · Oct 23, 2023 · Updated 2 years ago
Overcautious / ADENet
Accepted by TMM 2022
☆19 · Aug 18, 2022 · Updated 3 years ago
zcxu-eric / Ego4d_TalkNet_ASD
☆21 · Feb 15, 2022 · Updated 4 years ago
Tiago-Roxo / WASD
☆20 · Updated this week
cvdfoundation / ava-dataset
The AVA dataset densely annotates 80 atomic visual actions in 351k movie clips with actions localized in space and time, resulting in 1.6…
☆347 · Feb 9, 2022 · Updated 4 years ago
wbengine / SPMILM
☆18 · Apr 12, 2017 · Updated 9 years ago
Junhua-Liao / Light-ASD
The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)
☆181 · Mar 23, 2025 · Updated last year
joonson / syncnet_python
Out of time: automated lip sync in the wild
☆894 · Apr 17, 2026 · Updated 3 months ago
DanielMengLiu / DeepLip
Deep-learning-based audio-visual lip biometrics
☆15 · May 9, 2023 · Updated 3 years ago
EGO4D / audio-visual
☆69 · Sep 13, 2022 · Updated 3 years ago
Jiang-Yidi / TS-TalkNet
INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues
☆61 · May 29, 2023 · Updated 3 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆114 · Nov 16, 2020 · Updated 5 years ago
biboamy / AVASpeech_Music_Labels
☆20 · Nov 3, 2021 · Updated 4 years ago
clovaai / lookwhostalking
Look Who’s Talking: Active Speaker Detection in the Wild
☆76 · Aug 24, 2023 · Updated 2 years ago
showlab / AVA-AVD
☆22 · Nov 24, 2022 · Updated 3 years ago
zcxu-eric / AVA-AVD
☆51 · Nov 24, 2022 · Updated 3 years ago
danmic / av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222 · Apr 16, 2023 · Updated 3 years ago
SJTUwxz / LoCoNet_ASD
Code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection
☆57 · May 1, 2023 · Updated 3 years ago
vskadandale / vocalist
Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
☆73 · Apr 7, 2024 · Updated 2 years ago
roger-tseng / av-superb
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
☆58 · Apr 17, 2024 · Updated 2 years ago
uark-cviu / Right2Talk
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20 · Aug 2, 2021 · Updated 4 years ago
kimsunwiub / BLOOM-Net
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14 · Feb 13, 2022 · Updated 4 years ago
Yuanbo2020 / Audio-Visual-VAD
☆13 · May 9, 2022 · Updated 4 years ago
afourast / deep_lip_reading
Code and models for evaluating a state-of-the-art lip reading network
☆196 · Mar 24, 2023 · Updated 3 years ago
ultmaster / utilsd
Common deep learning utils.
☆18 · Nov 1, 2023 · Updated 2 years ago
ga642381 / SpeechGen
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
☆77 · Jun 9, 2023 · Updated 3 years ago
cmpute / audio-codec-benchmark
Comprehensive quantitative comparison of lossless and lossy audio codecs
☆41 · Feb 11, 2023 · Updated 3 years ago
uw-x / AcousticSwarms-Speech
☆30 · Sep 23, 2023 · Updated 2 years ago
zexupan / MuSE
☆42 · Nov 22, 2024 · Updated last year
pengzhendong / torchfa
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆61 · Sep 5, 2025 · Updated 10 months ago
isjwdu / DFADD
Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset
☆16 · Apr 7, 2025 · Updated last year
smeetrs / deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
☆244 · Feb 15, 2024 · Updated 2 years ago
XiangzhuKong / CA-Dense-UNet
An unofficial code reproduction of Channel Attention Dense U-Net for Multichannel Speech Enhancement
☆13 · Jul 17, 2023 · Updated 3 years ago
mpc001 / Lipreading_using_Temporal_Convolutional_Networks
ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASS…
☆438 · May 18, 2023 · Updated 3 years ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90 · Jul 25, 2024 · Updated 2 years ago
dynamic-superb / multimodal-llama
The official implementation of ImageBind-LLM and Whisper-LLM from the paper "Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Compre…
☆21 · Oct 30, 2023 · Updated 2 years ago