AudenAI/Auden

☆71

Alternatives and similar repositories for Auden

Users that are interested in Auden are comparing it to the libraries listed below.

ASLP-lab / FastTurn
☆33 · May 19, 2026 · Updated 2 months ago
ASLP-lab / Speaker-Reasoner
Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR
☆93 · May 13, 2026 · Updated 2 months ago
BUTSpeechFIT / SOT-DiCoW
Multi-talker ASR based on DiCoW with Serialized Output Training
☆20 · Sep 18, 2025 · Updated 10 months ago
Audio-Reasoning-Challenge / Audio-Reasoning-Challenge-Baselines
The baselines of ARC-Challenge-Interspeech2026
☆60 · Dec 1, 2025 · Updated 7 months ago
ASLP-lab / SenSE
Official code of SenSE.
☆90 · Oct 30, 2025 · Updated 8 months ago
mubingshen / MLC-SLM-Baseline
The project is associated with the recently-launched INTERSPEECH 2025 Workshop on Multilingual Conversational Speech Language Model (MLC-…
☆51 · May 14, 2025 · Updated last year
BUTSpeechFIT / mt-asr-data-prep
☆25 · Feb 26, 2026 · Updated 4 months ago
Audio-WestlakeU / FS-EEND
The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …
☆183 · May 7, 2026 · Updated 2 months ago
BUTSpeechFIT / DiariZen
A toolkit for speaker diarization.
☆504 · May 29, 2026 · Updated last month
nikhilraghav29 / diarizen-tutorial
DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline.
☆22 · Apr 24, 2026 · Updated 3 months ago
Audio-WestlakeU / CleanMel
Pytorch implementation of "CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR".
☆94 · Feb 2, 2026 · Updated 5 months ago
XiaomiMiMo / MiMo-Audio-Training
☆109 · Oct 16, 2025 · Updated 9 months ago
ASLP-lab / Smart-Glass-Challenge
☆17 · Jun 16, 2026 · Updated last month
facebookresearch / MMCSG
This repository contains the baseline system for CHiME-8 MMCSG challenge focusing on transcribing both sides of a conversation where one …
☆41 · Mar 13, 2024 · Updated 2 years ago
Maokui-He / NSD-MA-MSE
A pytorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"
☆62 · Sep 19, 2024 · Updated last year
nttcslab-sp / mamba-diarization
Official repository for Mamba-based Segmentation Model for Speaker Diarization
☆47 · May 13, 2025 · Updated last year
merlresearch / sebbs
Prediction of sound event bounding boxes (SEBBs)
☆35 · Aug 2, 2024 · Updated last year
Soul-AILab / SoulX-Transcriber
An end-to-end framework for multi-speaker transcription that jointly models who spoke, when, and what.
☆284 · Jun 22, 2026 · Updated last month
lucadellalib / discrete-wavlm-codec
A neural speech codec based on discrete WavLM representations
☆26 · Aug 28, 2024 · Updated last year
ASLP-lab / Easy-Turn
Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems
☆122 · Jan 25, 2026 · Updated 6 months ago
ASLP-lab / M7-TTS
M7-TTS: A Mini-Scale Multilingual and Multi-Dialect Text-to-Speech Language Model with Mimi codec and Multi Token Prediction
☆20 · Mar 19, 2026 · Updated 4 months ago
bigai-nlco / UltraVoice
Official Repository of UltraVoice
☆62 · Oct 28, 2025 · Updated 8 months ago
Kevin-naticl / LLaSE-G1
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
☆105 · Apr 1, 2025 · Updated last year
liutaocode / AwesomeDiarizationDataset
Both audio-only and audio-visual speaker diarization datasets are listed here.
☆16 · Feb 22, 2023 · Updated 3 years ago
NKU-HLT / DIFFA
[AAAI 2026 & ACL 2026] The official implementation of the DIFFA series for dLLM-based large audio language model
☆83 · Apr 7, 2026 · Updated 3 months ago
kaistmm / seed-pytorch
[INTERSPEECH 2025] Official code for "SEED: Speaker Embedding Enhancement Diffusion Model"
☆59 · Nov 3, 2025 · Updated 8 months ago
ASLP-lab / OSUM-Pangu
An Open-Source Multidimension Speech Understanding Foundation Model Built upon OpenPangu on Ascend NPUs
☆33 · Mar 15, 2026 · Updated 4 months ago
zhu-han / SpeechLLM
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆35 · Sep 25, 2025 · Updated 10 months ago
yufan-aslp / AliMeeting
The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to pro…
☆142 · Jun 10, 2022 · Updated 4 years ago
Kevin-naticl / LLaSE
LLaSE: Maximizing Acoustic Preservation for LLaMA based Speech Enhancement
☆16 · Jul 11, 2025 · Updated last year
wenet-e2e / west
We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
☆206 · Jul 17, 2026 · Updated last week
BUTSpeechFIT / TS-ASR-Whisper
☆116 · Jun 29, 2026 · Updated 3 weeks ago
MCoRec / mcorec_baseline
CHiME-9 Task 1 - MCoRec baseline
☆28 · Jan 13, 2026 · Updated 6 months ago
VoxBlink2 / ScriptsForVoxBlink2
Official Repository For VoxBlink2
☆88 · Aug 13, 2024 · Updated last year
lmxue / NVV-SuperBench
NVV-SuperBench: Beyond Words, Beyond Quality—Benchmarking Nonverbal Vocalizations in Speech Generation (Interspeech 2026 long paper)
☆18 · Jun 21, 2026 · Updated last month
joonaskalda / PixIT
Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…
☆105 · Jan 10, 2025 · Updated last year
Deep-unlearning / Llasa-GRPO
☆18 · Nov 19, 2025 · Updated 8 months ago
OpenMOSS / MOSS-Speech
MOSS-Speech is a true speech-to-speech large language model without text guidance.
☆138 · Feb 13, 2026 · Updated 5 months ago
Xflick / EEND_PyTorch
A PyTorch implementation of End-to-End Neural Diarization
☆110 · Jun 19, 2023 · Updated 3 years ago