[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆64Jan 24, 2024Updated 2 years ago
Alternatives and similar repositories for MSDWILD
Users that are interested in MSDWILD are comparing it to the libraries listed below.
- ☆16Feb 19, 2026Updated 3 months ago
- INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues☆61May 29, 2023Updated 3 years ago
- ☆51Nov 24, 2022Updated 3 years ago
- Dynamic vision-guided speaker embedding for audio-visual speaker diarization☆12Jul 5, 2022Updated 3 years ago
- ☆21Nov 24, 2022Updated 3 years ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection)☆176Mar 23, 2025Updated last year
- Accepted by TMM 2022☆19Aug 18, 2022Updated 3 years ago
- ☆42Nov 22, 2024Updated last year
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'☆477Oct 23, 2023Updated 2 years ago
- ☆65Jun 28, 2023Updated 2 years ago
- Audio-visual diarization pipeline used for creating VoxConverse dataset☆21Jun 6, 2025Updated last year
- Visualization tools for audio-only and multi-modal speaker diarization dataset☆13Oct 27, 2023Updated 2 years ago
- Spot the conversation: speaker diarisation in the wild☆167Jul 26, 2022Updated 3 years ago
- A CSRankings-like index for speech researchers☆35Oct 16, 2024Updated last year
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)☆22Jul 25, 2024Updated last year
- ☆20Mar 20, 2026Updated 2 months ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment☆14Feb 5, 2025Updated last year
- LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement☆16Jul 11, 2025Updated 11 months ago
- ☆14Oct 25, 2024Updated last year
- Diarization scoring tools.☆267Apr 8, 2026Updated 2 months ago
- ☆59Mar 28, 2025Updated last year
- ☆16Mar 7, 2019Updated 7 years ago
- neural network based speaker embedder☆24Jan 7, 2023Updated 3 years ago
- MagicData-RAMC Dataset and Baseline☆64Sep 13, 2022Updated 3 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆19Jul 16, 2024Updated last year
- Some comprehensive papers about speaker diarization☆362Mar 24, 2026Updated 2 months ago
- ☆55Oct 17, 2023Updated 2 years ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020☆43Jul 17, 2020Updated 5 years ago
- ☆24Sep 20, 2024Updated last year
- This repository contains the baseline system for CHiME-8 MMCSG challenge focusing on transcribing both sides of a conversation where one …☆41Mar 13, 2024Updated 2 years ago
- The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023)☆18Mar 21, 2023Updated 3 years ago
- ☆72Feb 15, 2021Updated 5 years ago
- The source code of Tim-TSENet☆15Apr 22, 2022Updated 4 years ago
- Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio)☆15Oct 22, 2025Updated 7 months ago
- A pytorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"☆62Sep 19, 2024Updated last year
- Pytorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.☆18Jul 11, 2022Updated 3 years ago
- Clustering-based methods for overlapping diarization☆85Jan 12, 2024Updated 2 years ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach☆20Aug 2, 2021Updated 4 years ago
- End-to-End Neural Diarization☆434Aug 30, 2021Updated 4 years ago