facebookresearch/MMCSG

This repository contains the baseline system for the CHiME-8 MMCSG challenge, which focuses on transcribing both sides of a conversation in which one participant wears smart glasses equipped with a microphone array and a camera.

☆41

Alternatives and similar repositories for MMCSG

Users who are interested in MMCSG are comparing it to the libraries listed below.

MagicHub-io / MagicData-RAMC
MagicData-RAMC Dataset and Baseline
☆64 · Sep 13, 2022 · Updated 3 years ago
facebookresearch / EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente…
☆143 · Dec 4, 2023 · Updated 2 years ago
microsoft / NOTSOFAR1-Challenge
NOTSOFAR-1 Challenge: Distant Diarization and ASR
☆65 · Feb 12, 2025 · Updated last year
k2-fsa / kaldi-decoder
Decoders from Kaldi using OpenFst
☆35 · Apr 10, 2026 · Updated 3 months ago
nwpuaslp / ASC_baseline
☆20 · Nov 22, 2020 · Updated 5 years ago
mispchallenge / misp2022_baseline
☆33 · Jun 26, 2023 · Updated 3 years ago
lucadellalib / discrete-wavlm-codec
A neural speech codec based on discrete WavLM representations
☆26 · Aug 28, 2024 · Updated last year
desh2608 / gss
A simple package for guided source separation (GSS)
☆134 · May 20, 2024 · Updated 2 years ago
yufan-aslp / AliMeeting
The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to pro…
☆142 · Jun 10, 2022 · Updated 4 years ago
AudenAI / Auden
☆71 · Apr 2, 2026 · Updated 3 months ago
liyunlongaaa / NSD-MS2S
CHiME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence ar…
☆88 · Jun 17, 2025 · Updated last year
Maokui-He / NSD-MA-MSE
A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"
☆62 · Sep 19, 2024 · Updated last year
facebookresearch / GlassesRoomID
Blind Identification of Binaural Room Impulse Responses from Head-Worn Microphone Arrays
☆20 · Sep 18, 2024 · Updated last year
mubingshen / MLC-SLM-Baseline
The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…
☆51 · May 14, 2025 · Updated last year
merlresearch / tssep
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings
☆43 · Oct 27, 2025 · Updated 8 months ago
liyunlongaaa / AD-TUNING
AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…
☆11 · Feb 23, 2024 · Updated 2 years ago
BUTSpeechFIT / mt-asr-data-prep
☆25 · Feb 26, 2026 · Updated 4 months ago
WingZLeung / TTDS
Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.
☆13 · Mar 15, 2025 · Updated last year
chorowski-lab / CPC_audio
An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆10 · Feb 22, 2022 · Updated 4 years ago
chimechallenge / chime-utils
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
☆26 · Feb 25, 2025 · Updated last year
usnistgov / F4DE
Framework for Detection Evaluation (F4DE): set of evaluation tools for detection evaluations and for specific NIST-coordinated evaluatio…
☆26 · Jul 6, 2017 · Updated 9 years ago
k2-fsa / text_search
Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup
☆79 · Jun 30, 2025 · Updated last year
Enny1991 / beamformers
Easy-to-use beamformers for multi-channel speech separation/enhancement
☆216 · Jan 26, 2021 · Updated 5 years ago
nguyenvulebinh / AVSRCocktail
Audio-Visual Speech Recognition
☆26 · Jul 7, 2025 · Updated last year
nervjack2 / MelHuBERT
Official implementation of MelHuBERT
☆70 · Feb 21, 2026 · Updated 5 months ago
jasonppy / word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
☆27 · Dec 4, 2023 · Updated 2 years ago
jsalt2020-asrdiar / jsalt2020_simulate
Training data simulation
☆60 · May 6, 2024 · Updated 2 years ago
X-LANCE / public_talks
Materials of public talks given by SJTU X-LANCE members
☆14 · Dec 3, 2022 · Updated 3 years ago
desh2608 / dover-lap
Python package for combining diarization system outputs.
☆94 · Oct 12, 2023 · Updated 2 years ago
fgnt / pb_chime5
Speech enhancement system for the CHiME-5 dinner party scenario
☆110 · Feb 6, 2025 · Updated last year
haiciyang / Remixing
Official repo of ICASSP 2022 paper - Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization
☆20 · Jan 7, 2025 · Updated last year
speechpro / mixup
☆24 · Mar 13, 2020 · Updated 6 years ago
Berkeley-Speech-Group / DysfluentWFST
DysfluentWFST
☆19 · Nov 13, 2025 · Updated 8 months ago
nikhilraghav29 / diarizen-tutorial
DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline.
☆21 · Apr 24, 2026 · Updated 2 months ago
kuan2jiu99 / Awesome-Speech-Generation
Survey on speech generation work.
☆21 · Nov 26, 2023 · Updated 2 years ago
X-LANCE / MSDWILD
[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆65 · Jan 24, 2024 · Updated 2 years ago
Kevin-naticl / LLaSE
LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement
☆16 · Jul 11, 2025 · Updated last year
X-LANCE / BER
Balanced Error Rate for Speaker Diarization
☆32 · Feb 28, 2023 · Updated 3 years ago
dihardchallenge / dihard3_baseline
☆30 · Jul 21, 2022 · Updated 4 years ago