[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆59 · Jan 24, 2024 · Updated 2 years ago
Alternatives and similar repositories for MSDWILD
Users that are interested in MSDWILD are comparing it to the libraries listed below.
- ☆16 · Feb 19, 2026 · Updated last month
- INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues ☆58 · May 29, 2023 · Updated 2 years ago
- ☆50 · Nov 24, 2022 · Updated 3 years ago
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization ☆12 · Jul 5, 2022 · Updated 3 years ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection) ☆170 · Mar 23, 2025 · Updated last year
- ☆21 · Nov 24, 2022 · Updated 3 years ago
- Accepted by TMM 2022 ☆19 · Aug 18, 2022 · Updated 3 years ago
- ☆42 · Nov 22, 2024 · Updated last year
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection' ☆462 · Oct 23, 2023 · Updated 2 years ago
- ☆64 · Jun 28, 2023 · Updated 2 years ago
- Audio-visual diarization pipeline used for creating VoxConverse dataset ☆21 · Jun 6, 2025 · Updated 9 months ago
- Visualization tools for audio-only and multi-modal speaker diarization dataset ☆13 · Oct 27, 2023 · Updated 2 years ago
- Spot the conversation: speaker diarisation in the wild ☆158 · Jul 26, 2022 · Updated 3 years ago
- A CSRankings-like index for speech researchers ☆35 · Oct 16, 2024 · Updated last year
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024) ☆22 · Jul 25, 2024 · Updated last year
- ☆20 · Mar 20, 2026 · Updated last week
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment ☆14 · Feb 5, 2025 · Updated last year
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement ☆16 · Jul 11, 2025 · Updated 8 months ago
- ☆13 · Oct 25, 2024 · Updated last year
- Diarization scoring tools. ☆262 · Mar 28, 2023 · Updated 3 years ago
- ☆59 · Mar 28, 2025 · Updated last year
- ☆16 · Mar 7, 2019 · Updated 7 years ago
- neural network based speaker embedder ☆25 · Jan 7, 2023 · Updated 3 years ago
- MagicData-RAMC Dataset and Baseline ☆58 · Sep 13, 2022 · Updated 3 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition ☆19 · Jul 16, 2024 · Updated last year
- Some comprehensive papers about speaker diarization ☆338 · Updated this week
- ☆52 · Oct 17, 2023 · Updated 2 years ago
- This repository contains the baseline system for CHiME-8 MMCSG challenge focusing on transcribing both sides of a conversation where one … ☆40 · Mar 13, 2024 · Updated 2 years ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020 ☆43 · Jul 17, 2020 · Updated 5 years ago
- ☆24 · Sep 20, 2024 · Updated last year
- The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023) ☆18 · Mar 21, 2023 · Updated 3 years ago
- ☆70 · Feb 15, 2021 · Updated 5 years ago
- The source code of Tim-TSENet ☆15 · Apr 22, 2022 · Updated 3 years ago
- A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding" ☆61 · Sep 19, 2024 · Updated last year
- Automatically set up the AISHELL-4 and MSDWild datasets for use with pyannote-database (and pyannote-audio) ☆15 · Oct 22, 2025 · Updated 5 months ago
- PyTorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training. ☆18 · Jul 11, 2022 · Updated 3 years ago
- Clustering-based methods for overlapping diarization ☆82 · Jan 12, 2024 · Updated 2 years ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 · Aug 2, 2021 · Updated 4 years ago
- End-to-End Neural Diarization ☆423 · Aug 30, 2021 · Updated 4 years ago