ictnlp / StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
☆1,058 Updated 7 months ago
Alternatives and similar repositories for StreamSpeech:
Users who are interested in StreamSpeech are comparing it to the repositories listed below.
- Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University. ☆438 Updated last week
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be… ☆899 Updated 3 weeks ago
- Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model". ☆899 Updated 5 months ago
- An Open-Sourced LLM-empowered Foundation TTS System ☆676 Updated this week
- Open source inference code for Rev's model ☆399 Updated last week
- GPT-4o-level, real-time spoken dialogue system. ☆314 Updated 2 months ago
- ☆1,259 Updated 10 months ago
- Local SRT/LLM/TTS Voicechat ☆660 Updated 6 months ago
- The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud. ☆1,680 Updated 8 months ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆677 Updated 4 months ago
- High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming full-duplex voice-interaction prototype agent implemented in just one file! ☆389 Updated last month
- first base model for full-duplex conversational audio ☆1,731 Updated 3 months ago
- A toolkit for speaker diarization. ☆183 Updated 3 weeks ago
- Omni SenseVoice: High-Speed Speech Recognition with words timestamps 🗣️🎯 ☆832 Updated last month
- Whisper with Medusa heads ☆830 Updated last month
- OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU. ☆357 Updated this week
- Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice. ☆431 Updated 5 months ago
- Interface for OuteTTS models. ☆1,160 Updated this week
- ☆159 Updated 4 months ago
- The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud. ☆1,665 Updated 9 months ago
- open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming… ☆3,279 Updated 5 months ago
- A Fast TTS Engine ☆488 Updated 2 months ago
- Pseudo Streaming SenseVoice with Hotwords ☆245 Updated last month
- ☆681 Updated 10 months ago
- Port of Funasr's Sense-voice model in C/C++ ☆326 Updated last week
- 🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI. ☆1,179 Updated this week
- Controllable and fast Text-to-Speech for over 7000 languages! ☆1,582 Updated 5 months ago
- ☆358 Updated 8 months ago
- PengChengStarling is specifically designed for developing multilingual ASR models based on the icefall project, supporting a complete ASR… ☆182 Updated last month
- ☆195 Updated 6 months ago