FireRedTeam/FireRedASR

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/FireRedTeam/FireRedASR)

FireRedTeam / FireRedASR

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recognition capability.

☆1,936

Alternatives and similar repositories for FireRedASR

Users that are interested in FireRedASR are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ASLP-lab / OSUM
View on GitHub
OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.
☆494Nov 23, 2025Updated 7 months ago
wenet-e2e / west
View on GitHub
We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
☆206Updated this week
FireRedTeam / FireRedASR2S
View on GitHub
A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/…
☆605Jun 2, 2026Updated last month
FireRedTeam / FireRedTTS
View on GitHub
An Open-Sourced LLM-empowered Foundation TTS System
☆907Sep 28, 2025Updated 9 months ago
modelscope / FunASR
View on GitHub
Open-source speech recognition toolkit for training, inference, streaming ASR, VAD, punctuation, speaker diarization pipelines, and OpenA…
☆19,333Updated this week
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
DataoceanAI / Dolphin
View on GitHub
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
☆771Jun 11, 2026Updated last month
FunAudioLLM / SenseVoice
View on GitHub
Open-source SenseVoiceSmall model for Mandarin, Cantonese, English, Japanese, and Korean ASR, language ID, emotion recognition, and audio…
☆8,888Updated this week
wenet-e2e / wenet
View on GitHub
Production First and Production Ready End-to-End Speech Recognition Toolkit
☆5,170Jun 15, 2026Updated last month
xingchensong / TouchNet
View on GitHub
A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp.
☆232Jul 2, 2026Updated 2 weeks ago
X-LANCE / SLAM-LLM
View on GitHub
A Framework for Speech, Language, Audio, Music Processing with Large Language Model
☆1,047Jan 15, 2026Updated 6 months ago
FunAudioLLM / Fun-ASR
View on GitHub
Open-source LLM-based ASR model family for Chinese, dialect, accent, and multilingual speech, with FunASR, vLLM, streaming, and llama.cpp…
☆1,410Updated this week
MrSupW / ContextASR-Bench
View on GitHub
A Massive Contextual Speech Recognition Benchmark.
☆107Aug 6, 2025Updated 11 months ago
wenet-e2e / WeTextProcessing
View on GitHub
Text Normalization & Inverse Text Normalization
☆800Jun 26, 2026Updated 3 weeks ago
modelscope / 3D-Speaker
View on GitHub
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
☆3,055Dec 8, 2025Updated 7 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
xphh / fireredasr-streaming
View on GitHub
low-latency realtime ASR based on FireRedASR
☆62Jul 8, 2025Updated last year
xiaomi-research / r1-aqa
View on GitHub
🤗 R1-AQA Model: mispeech/r1-aqa
☆325Mar 28, 2025Updated last year
Tele-AI / TeleSpeech-ASR
View on GitHub
☆855Jun 7, 2024Updated 2 years ago
TEN-framework / ten-vad
View on GitHub
Voice Activity Detector (VAD) : low-latency, high-performance and lightweight
☆2,198Feb 2, 2026Updated 5 months ago
pengzhendong / streaming-sensevoice
View on GitHub
Pseudo Streaming SenseVoice with Hotwords
☆465Jun 15, 2026Updated last month
xingchensong / S3Tokenizer
View on GitHub
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
☆517Dec 22, 2025Updated 6 months ago
stepfun-ai / Step-Audio2
View on GitHub
Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation…
☆1,482Mar 16, 2026Updated 4 months ago
Slyne / ctc_decoder
View on GitHub
A ctc decoder for both online and offline asr model
☆66Nov 18, 2023Updated 2 years ago
baichuan-inc / Baichuan-Audio
View on GitHub
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
☆223Feb 28, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
modelscope / ClearerVoice-Studio
View on GitHub
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Spe…
☆4,314Aug 14, 2025Updated 11 months ago
QwenLM / Qwen3-ASR
View on GitHub
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music…
☆3,176Jun 26, 2026Updated 3 weeks ago
xingchensong / FlashCosyVoice
View on GitHub
FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.
☆250Feb 25, 2026Updated 4 months ago
stepfun-ai / Step-Audio
View on GitHub
☆32Mar 16, 2026Updated 4 months ago
zai-org / GLM-4-Voice
View on GitHub
GLM-4-Voice | 端到端中英语音对话模型
☆3,204Dec 5, 2024Updated last year
FireRedTeam / FireRedTTS2
View on GitHub
Long-form streaming TTS system for multi-speaker dialogue generation
☆1,412Oct 26, 2025Updated 8 months ago
jishengpeng / WavChat
View on GitHub
A Survey of Spoken Dialogue Models (60 pages)
☆316Nov 28, 2024Updated last year
XiaomiMiMo / MiMo-Audio
View on GitHub
MiMo-Audio: Audio Language Models are Few-Shot Learners
☆1,063Jun 17, 2026Updated last month
QwenLM / Qwen2-Audio
View on GitHub
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
☆2,088Apr 21, 2025Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
inclusionAI / Ming-UniAudio
View on GitHub
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
☆450Nov 27, 2025Updated 7 months ago
FunAudioLLM / CosyVoice
View on GitHub
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
☆22,265May 25, 2026Updated last month
QwenLM / Qwen-Audio
View on GitHub
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
☆1,913Jul 5, 2024Updated 2 years ago
wenet-e2e / wespeaker
View on GitHub
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
☆1,358Jul 8, 2026Updated last week
MoonshotAI / Kimi-Audio
View on GitHub
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
☆4,677Jun 21, 2025Updated last year
FireRedTeam / FireRedChat
View on GitHub
A Fully Self-Hosted Solution for Full-Duplex Voice Interaction
☆568Sep 28, 2025Updated 9 months ago
k2-fsa / icefall
View on GitHub
☆1,453Updated this week