zcxu-eric / AVA-AVD
☆49 · Nov 24, 2022 · Updated 3 years ago
Alternatives and similar repositories for AVA-AVD
Users that are interested in AVA-AVD are comparing it to the libraries listed below
- INTERSPEECH 2023: Target Active Speaker Detection with Audio-visual Cues ☆58 · May 29, 2023 · Updated 2 years ago
- [INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild. ☆59 · Jan 24, 2024 · Updated 2 years ago
- ☆67 · Sep 13, 2022 · Updated 3 years ago
- A PyTorch implementation of End-to-End Neural Diarization ☆109 · Jun 19, 2023 · Updated 2 years ago
- Clustering-based methods for overlapping diarization ☆82 · Jan 12, 2024 · Updated 2 years ago
- ☆19 · Apr 18, 2024 · Updated last year
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024) ☆22 · Jul 25, 2024 · Updated last year
- ☆13 · Oct 25, 2024 · Updated last year
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection) ☆165 · Mar 23, 2025 · Updated 10 months ago
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization ☆12 · Jul 5, 2022 · Updated 3 years ago
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset ☆15 · Apr 7, 2025 · Updated 10 months ago
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection' ☆449 · Oct 23, 2023 · Updated 2 years ago
- This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-cl… ☆79 · Oct 18, 2022 · Updated 3 years ago
- ☆91 · Apr 24, 2025 · Updated 9 months ago
- ☆14 · Feb 9, 2023 · Updated 3 years ago
- ☆21 · Nov 24, 2022 · Updated 3 years ago
- ☆32 · Jun 26, 2023 · Updated 2 years ago
- ☆14 · Jul 11, 2022 · Updated 3 years ago
- Look Who’s Talking: Active Speaker Detection in the Wild ☆76 · Aug 24, 2023 · Updated 2 years ago
- ☆18 · Nov 22, 2024 · Updated last year
- ☆16 · Mar 7, 2019 · Updated 6 years ago
- Consistent dictionary learning algorithm for signal declipping (Python code) ☆20 · Oct 24, 2018 · Updated 7 years ago
- CDER (Conversational Diarization Error Rate) Scoring Tool ☆22 · Sep 13, 2022 · Updated 3 years ago
- ☆22 · Jun 30, 2021 · Updated 4 years ago
- ☆21 · Feb 15, 2022 · Updated 4 years ago
- ☆11 · Sep 4, 2023 · Updated 2 years ago
- VoxSRC2022 workshop development kit ☆19 · Jul 21, 2022 · Updated 3 years ago
- ☆66 · Feb 8, 2024 · Updated 2 years ago
- CHIME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence ar… ☆83 · Jun 17, 2025 · Updated 7 months ago
- Tools for downloading VoxCeleb2 dataset ☆33 · Mar 16, 2024 · Updated last year
- Active Speaker Detection ☆19 · Jun 19, 2020 · Updated 5 years ago
- Source code of paper <End-to-End Language Diarization for Bilingual Code-switching Speech> ☆19 · Jan 23, 2022 · Updated 4 years ago
- ☆20 · Dec 29, 2024 · Updated last year
- ☆18 · Sep 19, 2023 · Updated 2 years ago
- A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding" ☆60 · Sep 19, 2024 · Updated last year
- 1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context ☆16 · Dec 8, 2022 · Updated 3 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- Spot the conversation: speaker diarisation in the wild ☆157 · Jul 26, 2022 · Updated 3 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN) ☆21 · Sep 27, 2017 · Updated 8 years ago