opendilab / CleanS2S
High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!
☆481 Updated 3 weeks ago
Alternatives and similar repositories for CleanS2S
Users who are interested in CleanS2S are comparing it to the libraries listed below.
- Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across various modality combinations… ☆377 Updated 6 months ago
- PengChengStarling is specifically designed for developing multilingual ASR models based on the icefall project, supporting a complete ASR… ☆184 Updated 10 months ago
- ☆204 Updated last year
- Video QA Assistant based on LLMs with frame convolution ☆216 Updated 2 years ago
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis. ☆1,218 Updated 6 months ago
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM ☆360 Updated 7 months ago
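- Multi-Agent-GPT: a multimodal expert assistant GPT built on RAG and agents. It integrates tools for modalities such as text, image, and audio, and supports local deployment and building private databases. ☆249 Updated 10 months ago
- An API extension for Graphrag that can be called via API and embedded in your own web service. ☆116 Updated last year
- StyleLLM writing-style model: a text style transfer project based on large language models. #TextEmbellishment #Polishing #StyleImitation ☆345 Updated last year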
- Pseudo Streaming SenseVoice with Hotwords ☆412 Updated 9 months ago
- ☆194 Updated last year
- A package for parsing PDFs and analyzing their content using LLMs. ☆270 Updated last year
- Accelerates cosyvoice2 inference with vllm. ☆465 Updated 8 months ago
- General-purpose LLM × writing-style LLM = chatbots with diverse styles ☆51 Updated last year
- A Fully Self-Hosted Solution for Full-Duplex Voice Interaction ☆461 Updated 3 months ago
- GPT-4o-level, real-time spoken dialogue system. ☆363 Updated 11 months ago
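- A web service built on Node.js, Vue3, and uniapp offering ChatGPT + agents + Midjourney image generation + PPT generation + Suno music + Pika/Runway/Sora video | A private AIGC platform for individuals, teams, and enterprises ☆274 Updated last month
- EasyRAG is a lightweight knowledge-base retrieval system built on a vector database, designed for local deployment with very low hardware requirements. It incorporates modern data-governance ideas, provides a friendly web interface, and supports importing, chunking, and retrieving multiple file formats. It uses retrieval-augmented generation (RAG) to manage and retrieve large document collections effectively while ensuring data quality… ☆124 Updated 5 months ago
- Fixes the bug in funasr where seaco-paraformer has no timestamps after being exported to onnx. ☆24 Updated last year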
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction ☆216 Updated 10 months ago
- Toolkit for Prompt Compression ☆285 Updated 10 months ago
- Official code for "DiaMoE-TTS: A Unified IPA-based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptat… ☆210 Updated last month
- Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation… ☆578 Updated last year
- MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction mode… ☆219 Updated last year
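- This project open-sources a NextJS-based front end, aiming to provide a reference web front-end platform for generative-AI text-to-video, especially for film production from screenwriting to video generation. Everyone can become a director. The Nextjs front-end of an AI driven pla… ☆197 Updated last year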
- Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation" ☆37 Updated 2 years ago
- ☆113 Updated last year
- 🤗 R1-AQA Model: mispeech/r1-aqa ☆311 Updated 9 months ago
- ☆482 Updated 8 months ago
- A Template Based Report Rendering Platform. ☆330 Updated last year