facebookresearch / MMCSG
This repository contains the baseline system for the CHiME-8 MMCSG challenge, which focuses on transcribing both sides of a conversation in which one participant wears smart glasses equipped with a microphone array and a camera.
☆32Updated last year
Alternatives and similar repositories for MMCSG:
Users who are interested in MMCSG are comparing it to the repositories listed below.
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆39Updated 7 months ago
- Multipurpose Multi Speaker Mixture Signal Generator☆44Updated 2 months ago
- ☆30Updated last year
- ☆30Updated 5 months ago
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"☆13Updated 2 years ago
- ☆43Updated last year
- (ICASSP 2025) Dynamic Embedding Causal Target Speech Extraction☆2Updated last month
- ☆36Updated 2 years ago
- Balanced Error Rate for Speaker Diarization☆30Updated 2 years ago
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Updated last year
- DUSTED: Spoken-Term Discovery using Discrete Speech Units☆17Updated 6 months ago
- A toolkit dedicated to speech evaluation.☆19Updated 7 months ago
- ☆25Updated 5 months ago
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)☆41Updated last year
- ☆29Updated 2 years ago
- A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification"☆31Updated 3 years ago
- Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings☆28Updated 2 years ago
- Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.☆37Updated last year
- ☆48Updated 4 months ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆16Updated 9 months ago
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…☆36Updated last year
- COG-MHEAR Audio-Visual Speech Enhancement Challenge☆39Updated 3 weeks ago
- [SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition☆12Updated 4 months ago
- Official implementation of "PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords" (INTERSPEECH 2023)☆46Updated 10 months ago
- PyTorch implementation of WASE, described in our ICASSP 2021 paper: "Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Envi…☆25Updated 3 years ago
- A list of papers for child ASR☆39Updated 6 months ago
- AudioVisual Diarization - Supervised and Unsupervised☆14Updated 2 years ago
- ☆29Updated 2 years ago
- ☆33Updated 4 years ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM☆16Updated 5 months ago