yuekaizhang / Fun-ASR-vllm
Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.
☆39 Updated this week
Alternatives and similar repositories for Fun-ASR-vllm
Users who are interested in Fun-ASR-vllm are comparing it to the repositories listed below
- A CTC decoder for both online and offline ASR models ☆66 Updated 2 years ago
- X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech… ☆148 Updated last week
- CTC decoder with hotwords for ASR. ☆34 Updated 9 months ago
- ☆13 Updated 2 years ago
- CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO Fine-Tuning! ☆110 Updated 5 months ago
- Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English. ☆113 Updated last month
- Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems ☆67 Updated 3 months ago
- Huawei Grad-TTS for Chinese ☆50 Updated 2 years ago
- [ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer. ☆38 Updated last year
- Streaming Text to Speech Web UI ☆22 Updated last year
- Python Wrapper of Silero VAD ☆64 Updated 8 months ago
- ☆33 Updated 4 years ago
- One command to build TLG.fst for WeNet. ☆30 Updated 3 years ago
- Inference code for Audiodec-Valle-Wenetspeech4TTS ☆50 Updated last year
- Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English. ☆63 Updated 4 months ago
- ☆23 Updated last year
- In-car multi-channel speech transcription system of AISHELL-5. ☆38 Updated 7 months ago
- ☆40 Updated 4 years ago
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability. ☆20 Updated 9 months ago
- Python runtime for WeTextProcessing (does not depend on Pynini) ☆44 Updated last month
- Utilizes ONNX Runtime for audio denoising. ☆107 Updated 2 weeks ago
- ☆112 Updated 2 months ago
- Chinese Text Normalization and Dataset ☆89 Updated 3 years ago
- An enterprise-grade Chinese-English code-switching punctuator from funasr. ☆29 Updated last year
- Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation ☆125 Updated 2 weeks ago
- Optimized loss based on cross-entropy (CE), like MWER (minimum WER) Loss with beam search and negative sampling strategy, Smoothed Max Po… ☆24 Updated last year
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization ☆102 Updated last year
- AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data ☆34 Updated 2 years ago
- Noise reduction ☆17 Updated last year
- ☆61 Updated 2 years ago