DaiYvhang/AISHELL-5

In-car multi-channel speech transcription system of AISHELL-5.

☆48

Alternatives and similar repositories for AISHELL-5

Users who are interested in AISHELL-5 are comparing it to the repositories listed below.

aleXiehta / AD-FlowTSE
Adaptive Flow-Matching for Target Speaker Extraction
☆39 • Jul 13, 2026 • Updated 2 weeks ago
pengzhendong / streaming-tts-webui
Streaming Text to Speech Web UI
☆22 • May 6, 2024 • Updated 2 years ago
hyyan2k / LiSenNet
This is the official implementation of the LiSenNet
☆163 • Nov 15, 2024 • Updated last year
jhuang448 / MultilingualALT
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"
☆15 • Jun 28, 2024 • Updated 2 years ago
microsoft / Distill-MOS
Distillation of Self-Supervised Representation-Based Speech Quality Assessment
☆49 • May 15, 2025 • Updated last year
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 • Sep 30, 2024 • Updated last year
Audio-Reasoning-Challenge / Audio-Reasoning-Challenge-Baselines
The baselines of ARC-Challenge-Interspeech2026
☆60 • Dec 1, 2025 • Updated 7 months ago
MrSupW / ICMC-ASR_Baseline
The baseline system for the ICASSP2024 ICMC-ASR Challenge.
☆57 • Dec 6, 2023 • Updated 2 years ago
urgent-challenge / urgent2025_challenge
Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge.
☆85 • May 21, 2025 • Updated last year
KhanhNguyen4999 / Speech-Enhancement-CLSKD
Cross-Layer Similarity Knowledge Distillation for Speech Enhancement
☆11 • Jun 22, 2023 • Updated 3 years ago
ASLP-lab / WenetSpeech-Chuan
Official repository for the WenetSpeech-Chuan dataset.
☆217 • Jul 14, 2026 • Updated 2 weeks ago
lixilinx / IVA4Cocktail
Neural network density models for speech separation.
☆20 • Nov 26, 2020 • Updated 5 years ago
yongaifadian1 / MNV-17
Qwen2.5-Omni fine-tuned on MNV-17 dataset for nonverbal vocalization recognition
☆31 • Nov 13, 2025 • Updated 8 months ago
Picovoice / text-to-speech-benchmark
Text-to-Speech Benchmark
☆26 • Apr 2, 2026 • Updated 3 months ago
ASLP-lab / Speaker-Reasoner
Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR
☆93 • May 13, 2026 • Updated 2 months ago
xcc-zach / xtalk
X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech…
☆233 • Updated this week
ASLP-lab / WenetSpeech-Yue
A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation
☆345 • Jun 6, 2026 • Updated last month
FYJNEVERFOLLOWS / LaBNet
Official PyTorch implementation of the Interspeech 2023 paper
☆29 • Jul 5, 2023 • Updated 3 years ago
llm-lab-org / CLASP
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
☆13 • Jun 27, 2025 • Updated last year
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45 • Feb 9, 2025 • Updated last year
pengzhendong / wavesurfer
For audio visualization and playback in Jupyter notebooks.
☆18 • Nov 25, 2025 • Updated 8 months ago
seorim0 / SE-using-SRL-Model
Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings
☆21 • Jun 6, 2025 • Updated last year
urgent-challenge / urgent2024_challenge
Official data preparation scripts for the URGENT 2024 Challenge
☆90 • May 21, 2025 • Updated last year
REAL-TSE / REAL-TSE-Challenge
☆35 • Jun 1, 2026 • Updated last month
ZhongYang2026 / Sandglasset-A-Light-Multi-Granularity-Self-Attentive-Network-For-Time-Domain-Speech-Separation
Speech Separation
☆21 • Mar 7, 2024 • Updated 2 years ago
wenet-e2e / wesep
Target Speaker Extraction Toolkit
☆300 • Oct 4, 2025 • Updated 9 months ago
pengzhendong / audiolab
A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)
☆39 • Mar 31, 2026 • Updated 3 months ago
zqwang7 / CausalityCheck
Causality Check in Frame-online Speech Separation
☆51 • Dec 11, 2022 • Updated 3 years ago
SoulProficiency / speechseparation-Sandglasset
☆13 • Jun 24, 2021 • Updated 5 years ago
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 • Mar 13, 2024 • Updated 2 years ago
HuangZikang-TJU / Aug4TSE
☆15 • Sep 16, 2024 • Updated last year
ASLP-lab / FastTurn
☆35 • May 19, 2026 • Updated 2 months ago
Mddct / transformer-vocos
☆35 • Sep 6, 2025 • Updated 10 months ago
Alittleegg / Eureka-Audio
Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasonin…
☆40 • Apr 11, 2026 • Updated 3 months ago
ZLiNJU / AFC-SPEX
Source code and audio samples for AFC-SPEX, an algorithm that can jointly perform acoustic feedback cancellation and speaker extraction.
☆40 • Nov 7, 2025 • Updated 8 months ago
gemengtju / SpEx_Plus
SpEx+(tied) source code
☆96 • Jul 6, 2023 • Updated 3 years ago
narrietal / Fast-ULCNet
Official repository of Fast-ULCNet.
☆39 • Jun 17, 2026 • Updated last month
xiaoxue1117 / speech-mamba-public
☆15 • Nov 26, 2024 • Updated last year
xingchensong / TouchNet
A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp.
☆233 • Jul 2, 2026 • Updated 3 weeks ago