ictnlp / StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
☆1,085 Updated 9 months ago
Alternatives and similar repositories for StreamSpeech
Users who are interested in StreamSpeech are comparing it to the repositories listed below.
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model". ☆908 Updated 7 months ago
- ☆1,321 Updated 11 months ago
- An Open-Sourced LLM-empowered Foundation TTS System ☆715 Updated last week
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆715 Updated 5 months ago
- Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University. ☆499 Updated 2 weeks ago
- ☆1,128 Updated 3 months ago
- Interface for OuteTTS models. ☆1,283 Updated this week
- ☆377 Updated 3 weeks ago
- Local SRT/LLM/TTS Voicechat ☆680 Updated 7 months ago
- Port of Funasr's Sense-voice model in C/C++ ☆374 Updated 3 weeks ago
- High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice-interaction prototype agent implemented in just one file! ☆418 Updated 2 months ago
- ☆400 Updated 2 weeks ago
- PengChengStarling is specifically designed for developing multilingual ASR models based on the icefall project, supporting a complete ASR… ☆181 Updated 2 months ago
- Open source inference code for Rev's model ☆404 Updated last month
- Controllable and fast Text-to-Speech for over 7000 languages! ☆1,597 Updated last week
- Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice. ☆441 Updated 6 months ago
- The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud. ☆1,743 Updated last month
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be… ☆1,018 Updated 2 months ago
- VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design ☆567 Updated last year
- The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud. ☆1,699 Updated 10 months ago
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯 ☆848 Updated 2 months ago
- ☆359 Updated 10 months ago
- OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU. ☆369 Updated this week
- [ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching ☆1,012 Updated this week
- first base model for full-duplex conversational audio ☆1,746 Updated 4 months ago
- ☆701 Updated 11 months ago
- 🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI. ☆1,243 Updated last week
- Text to speech alignment using CTC forced alignment ☆288 Updated 2 months ago
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3 ☆410 Updated 8 months ago
- GPT-4o-level, real-time spoken dialogue system. ☆327 Updated 4 months ago