MCoRec/mcorec_baseline

CHiME-9 Task 1 - MCoRec baseline

☆28

Alternatives and similar repositories for mcorec_baseline

Users who are interested in mcorec_baseline are comparing it to the repositories listed below.

nguyenvulebinh / AVSRCocktail
Audio-Visual Speech Recognition
☆26 · Jul 7, 2025 · Updated last year
nguyenvulebinh / AV-HuBERT-S2S
Huggingface Implementation of AV-HuBERT on the MuAViC Dataset
☆19 · Mar 6, 2025 · Updated last year
REAL-TSE / REAL-TSE-Challenge
☆33 · Jun 1, 2026 · Updated last month
BUTSpeechFIT / mt-asr-data-prep
☆25 · Feb 26, 2026 · Updated 4 months ago
BUTSpeechFIT / SOT-DiCoW
Multi-talker ASR based on DiCoW with Serialized Output Training
☆20 · Sep 18, 2025 · Updated 10 months ago
merlresearch / tssep
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings
☆43 · Oct 27, 2025 · Updated 8 months ago
popcornell / FastMSS
☆32 · May 18, 2026 · Updated 2 months ago
BUTSpeechFIT / TS-ASR-Whisper
☆116 · Jun 29, 2026 · Updated 3 weeks ago
ZXHY-82 / w2v-BERT-2.0_SV
☆53 · Mar 28, 2026 · Updated 3 months ago
mubingshen / MLC-SLM-Baseline
The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…
☆51 · May 14, 2025 · Updated last year
BUTSpeechFIT / DiCoW
☆100 · Jan 28, 2026 · Updated 5 months ago
fgnt / mms_msg
Multipurpose Multi Speaker Mixture Signal Generator
☆46 · Feb 6, 2025 · Updated last year
FrenchKrab / IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
☆96 · Oct 18, 2023 · Updated 2 years ago
ahaliassos / usr
Official implementation of USR (NeurIPS 2024)
☆40 · Dec 21, 2024 · Updated last year
nikhilraghav29 / diarizen-tutorial
DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline.
☆22 · Apr 24, 2026 · Updated 3 months ago
joonaskalda / PixIT
Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…
☆105 · Jan 10, 2025 · Updated last year
ASLP-lab / FMSU-Bench
Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model
☆25 · May 21, 2026 · Updated 2 months ago
ahaliassos / raven
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆82 · Feb 27, 2025 · Updated last year
BUTSpeechFIT / vae_dolphin
☆10 · Jan 26, 2021 · Updated 5 years ago
liyunlongaaa / NSD-MS2S
CHiME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence ar…
☆88 · Jun 17, 2025 · Updated last year
clement-pages / gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
☆71 · Apr 22, 2026 · Updated 3 months ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆35 · Jun 20, 2023 · Updated 3 years ago
JethroWangSir / SincQDR-VAD
☆26 · Aug 29, 2025 · Updated 10 months ago
193746 / VHASR
☆11 · Oct 31, 2024 · Updated last year
AudenAI / Auden
☆71 · Apr 2, 2026 · Updated 3 months ago
wenet-e2e / wesep
Target Speaker Extraction Toolkit
☆299 · Oct 4, 2025 · Updated 9 months ago
Audio-WestlakeU / FS-EEND
The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …
☆183 · May 7, 2026 · Updated 2 months ago
BUTSpeechFIT / torch_msbg_mbstoi
Differentiable implementation of MSBG hearing loss model and MBSTOI intelligibility metric for Clarity Enhancement challenge.
☆21 · Nov 19, 2021 · Updated 4 years ago
YasserdahouML / VSR_test_set
WildVSR
☆22 · Dec 13, 2023 · Updated 2 years ago
Audio-Reasoning-Challenge / Audio-Reasoning-Challenge-Baselines
The baselines of ARC-Challenge-Interspeech2026
☆60 · Dec 1, 2025 · Updated 7 months ago
merlresearch / sebbs
Prediction of sound event bounding boxes (SEBBs)
☆35 · Aug 2, 2024 · Updated last year
Sindhu-Hegde / multivsr
Official code for the paper "Scaling Multilingual Visual Speech Recognition"
☆20 · Aug 15, 2025 · Updated 11 months ago
Exgc / AVMuST-TED
☆24 · Mar 30, 2024 · Updated 2 years ago
BUTSpeechFIT / ASR-hybrid-decoding
☆17 · Nov 25, 2019 · Updated 6 years ago
desh2608 / gss
A simple package for Guided source separation (GSS)
☆134 · May 20, 2024 · Updated 2 years ago
ahmadikalkhorani / CrossNet
☆36 · Apr 11, 2024 · Updated 2 years ago
llm-jp / llama-mimi
Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…
☆31 · Sep 20, 2025 · Updated 10 months ago
mmmmayi / ExPO
Official implementation of the paper "ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification"
☆14 · Mar 14, 2025 · Updated last year
umbertocappellazzo / Omni-AVSR
Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202…
☆38 · Mar 10, 2026 · Updated 4 months ago