facebookresearch / MMCSG
This repository contains the baseline system for the CHiME-8 MMCSG challenge, which focuses on transcribing both sides of a conversation in which one participant wears smart glasses equipped with a microphone array and camera.
☆31Updated 11 months ago
Alternatives and similar repositories for MMCSG:
Users that are interested in MMCSG are comparing it to the libraries listed below
- Multipurpose Multi Speaker Mixture Signal Generator☆44Updated last month
- ☆30Updated last year
- ☆29Updated 2 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆39Updated 6 months ago
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"☆13Updated 2 years ago
- ☆43Updated last year
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)☆40Updated last year
- ☆36Updated 2 years ago
- ☆29Updated 3 months ago
- Source code and demo for the INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…☆36Updated last year
- Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting"☆23Updated 3 months ago
- A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification"☆31Updated 3 years ago
- COG-MHEAR Audio-Visual Speech Enhancement Challenge☆34Updated 11 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆53Updated 4 months ago
- Balanced Error Rate for Speaker Diarization☆29Updated 2 years ago
- ☆15Updated 2 months ago
- A collection of papers related to speech model compression☆24Updated last year
- ☆25Updated 4 months ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models☆47Updated this week
- A toolkit dedicated to speech evaluation.☆19Updated 5 months ago
- ☆26Updated last year
- ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS☆28Updated last year
- [SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition☆11Updated 3 months ago
- Streaming Audiotransformers for online Audio tagging☆43Updated 8 months ago
- PyTorch implementation of our paper "Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training".☆17Updated 2 years ago
- DUSTED: Spoken-Term Discovery using Discrete Speech Units☆15Updated 5 months ago
- A list of papers for child ASR☆37Updated 4 months ago
- Implementation of the paper "Self-supervised Learning with Random-projection Quantizer for Speech Recognition" in PyTorch.☆70Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆46Updated last year