zcxu-eric/Ego4d_TalkNet_ASD

zcxu-eric / Ego4d_TalkNet_ASD

☆21

Alternatives and similar repositories for Ego4d_TalkNet_ASD

Users who are interested in Ego4d_TalkNet_ASD are comparing it to the repositories listed below.

EGO4D / audio-visual
☆69 · Sep 13, 2022 · Updated 3 years ago
SAGNIKMJR / ego-AV-spatial-correspondence
[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆14 · Jun 16, 2024 · Updated 2 years ago
tuanchien / asd
Active Speaker Detection
☆19 · Jun 19, 2020 · Updated 6 years ago
TaoRuijie / TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
☆489 · Oct 23, 2023 · Updated 2 years ago
fuankarion / active-speakers-context
Code for the Active Speakers in Context Paper (CVPR2020)
☆58 · May 19, 2021 · Updated 5 years ago
okankop / ASDNet
Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset
☆73 · Jan 18, 2022 · Updated 4 years ago
facebookresearch / EgoCom-Dataset
EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset
☆63 · Nov 23, 2020 · Updated 5 years ago
zcxu-eric / AVA-AVD
☆51 · Nov 24, 2022 · Updated 3 years ago
Jackson-Kang / MFARunner
A simple tool to easily use Montreal Forced Aligner. Also provides alignments (TextGrid) retrieved from ESD.
☆45 · May 25, 2023 · Updated 3 years ago
changelinglab / prism
A toolkit and benchmark for evaluating phonetic capabilities of speech models.
☆18 · Apr 10, 2026 · Updated 3 months ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
liutaocode / DiarizationVisualization
Visualization tools for audio-only and multi-modal speaker diarization dataset
☆13 · Oct 27, 2023 · Updated 2 years ago
Tiago-Roxo / WASD
☆20 · Updated this week
SRA2 / SPELL
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022)
☆67 · Oct 29, 2023 · Updated 2 years ago
xiaoneil / LPNet
☆13 · Nov 28, 2021 · Updated 4 years ago
Cloudox / OXHanoiDemo
Tower of Hanoi game: enter the number of layers, and it automatically draws the puzzle and animates the solution.
☆12 · May 18, 2017 · Updated 9 years ago
mdx-tutorial / mdx-tutorial.github.io
Tutorial covering Open Source tools for Source Separation.
☆15 · Nov 12, 2021 · Updated 4 years ago
karchkha / MelSpec_GPT_VQVAE
Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms
☆18 · Oct 8, 2023 · Updated 2 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
YoungSeng / ReprGesture
The ReprGesture entry to the GENEA Challenge 2022 (ICMI 2022)
☆16 · Nov 8, 2022 · Updated 3 years ago
yesheng-THU / GFGE
GFGE
☆15 · Sep 7, 2022 · Updated 3 years ago
gyx-gloria / DMT
Official Implementation of DMT: Dual Mean-Teacher in PyTorch.
☆10 · Oct 27, 2023 · Updated 2 years ago
gli-27 / voca-pytorch
☆10 · Jan 5, 2020 · Updated 6 years ago
alibaba-mmai-research / Masked-Action-Recognition
Official code for the paper: MAR: Masked Autoencoders for Efficient Action Recognition
☆32 · Dec 7, 2022 · Updated 3 years ago
Boeun-Kim / GL-Transformer
This is the official implementation of Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning (ECCV 2022).
☆23 · Nov 6, 2023 · Updated 2 years ago
jwr1995 / WD-TCN
☆11 · Aug 5, 2022 · Updated 3 years ago
xjchenGit / SingGraph
Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).
☆24 · Sep 19, 2025 · Updated 10 months ago
iLearn-Lab / MM23-RTQ
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model
☆15 · Apr 7, 2026 · Updated 3 months ago
vineetjohn / research-review-notes
Research Paper Review Notes
☆13 · Oct 26, 2018 · Updated 7 years ago
EGO4D / social-interactions
☆56 · Aug 7, 2022 · Updated 3 years ago
PoTaTo-Mika / Shore-Data-Engine
A codebase for data crawling and preprocessing for TTS and ASR systems training.
☆23 · Jun 13, 2026 · Updated last month
JeongHun0716 / e-mvsr
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)
☆20 · Mar 17, 2025 · Updated last year
Minglu58 / TA2V
☆16 · Dec 1, 2025 · Updated 7 months ago
yu-haoyuan / fd-badcat
fd-sds
☆20 · Apr 8, 2026 · Updated 3 months ago
shincling / discreteSeparation
The demo for "Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem".
☆12 · Oct 25, 2021 · Updated 4 years ago
sigal-raab / Motion
Motion classes, based on Holden's code http://theorangeduck.com/page/deep-learning-framework-character-motion-synthesis-and-editing
☆26 · Apr 23, 2026 · Updated 3 months ago
stogiannidis / srbench
Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models"
☆19 · Feb 1, 2026 · Updated 5 months ago
srama2512 / mapnet-pytorch
Unofficial PyTorch implementation of MapNet: An Allocentric Spatial Memory for Mapping Environments
☆12 · Jun 4, 2020 · Updated 6 years ago
EGO4D / episodic-memory
☆139 · May 30, 2024 · Updated 2 years ago