opendilab / CleanS2S
High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!
☆378Updated 3 weeks ago
Alternatives and similar repositories for CleanS2S:
Users that are interested in CleanS2S are comparing it to the libraries listed below
- ☆192Updated 6 months ago
- PengChengStarling is specifically designed for developing multilingual ASR models based on the icefall project, supporting a complete ASR…☆189Updated 3 weeks ago
- ☆218Updated last month
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆297Updated 2 months ago
- GPT-4o-level, real-time spoken dialogue system.☆300Updated 2 months ago
- Video QA Assistant based on LLMs with frame convolution☆210Updated last year
- Pseudo Streaming SenseVoice with Hotwords☆225Updated 2 weeks ago
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.☆1,043Updated 7 months ago
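- Multi-Agent-GPT: a multimodal expert assistant GPT built on RAG and agents. It integrates tools for modalities such as text, image, and audio, and supports local deployment and private database construction.☆219Updated last month
- An API extension for Graphrag that can be called via API and embedded in your own web service☆121Updated 3 months ago
- General-purpose LLM × writing-style LLM = chatbots with diverse styles☆43Updated 8 months ago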
- ☆348Updated 8 months ago
- ☆204Updated 4 months ago
- An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System☆277Updated last month
- Sample Repository for the AlibabaCloud Bailian Speech SDK☆135Updated this week
- 🤗 R1-AQA Model: mispeech/r1-aqa☆197Updated last week
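- A web service built on Node.js, Vue3, and uniapp offering ChatGPT + agents + Midjourney image generation + PPT generation + Suno music + Pika/Runway/Sora video | A private AIGC platform for individuals, teams, and enterprises☆239Updated this week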
- Extension of ChatTTS, 3x Faster on Windows, Support Voice Cloning and Mobile Deployment☆141Updated last month
- ☆310Updated 3 months ago
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction☆163Updated last month
- A package for parsing PDFs and analyzing their content using LLMs.☆266Updated 7 months ago
- We Speech Transcript based on LLM, in 300 lines of code.☆153Updated 3 weeks ago
- Accelerates cosyvoice2 inference using vllm☆111Updated 2 weeks ago
- StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding☆114Updated this week
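- Real-time voice-interactive digital human, supporting an end-to-end speech scheme (GLM-4-Voice - THG) and a cascaded scheme (ASR-LLM-TTS-THG). Appearance and voice are customizable with no training required, voice cloning is supported, and first-packet latency is as low as 3s.☆829Updated last week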
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be…☆797Updated 3 weeks ago
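- Real-time STT connected to the OpenAI API/Zhipu AI (streaming LLM) and GPT-SOVITS/Edge-TTS, with cross-network service calls made through a web page to achieve real-time conversation☆346Updated 2 months ago
- StyleLLM writing-style LLM: a text style transfer project based on large language models. #text-embellishment #polishing #style-imitation☆286Updated 9 months ago
- A reproduction of the llama-omni training code☆57Updated 2 months ago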
- MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction mode…☆198Updated 2 months ago