Sample Repository for the AlibabaCloud Bailian Speech SDK
☆383Dec 19, 2025Updated 2 months ago
Alternatives and similar repositories for alibabacloud-bailian-speech-demo
Users who are interested in alibabacloud-bailian-speech-demo are comparing it to the libraries listed below.
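- Content moderation and rate limiting service☆26May 18, 2025Updated 9 months ago
- “alibabacloud-nls-python-sdk provides access to Alibaba Cloud's intelligent speech services, including speech recognition, speech synthesis, and file transcription.”☆79Aug 22, 2025Updated 6 months ago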
- This is a speech interaction system built on an open-source model, integrating ASR, LLM, and TTS in sequence. The ASR model is SenseVoice…☆1,134Mar 1, 2025Updated last year
- PyTorch implementation of silero-vad☆36Nov 23, 2024Updated last year
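- 百聆 (Bailing) is a GPT-4o-style voice chatbot built from ASR + LLM + TTS. It integrates strong large models such as DeepSeek R1, reaches latency as low as 800 ms, runs on low-spec machines such as a Mac, and supports interruption☆1,622Jul 31, 2025Updated 7 months ago
- Visual dialogue for Xiaozhi☆32Apr 25, 2025Updated 10 months ago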
- faster inference☆28Jan 20, 2025Updated last year
- ☆15Jul 4, 2024Updated last year
- RTC AIGC Demo☆250Nov 19, 2025Updated 3 months ago
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity…☆15,139Feb 28, 2026Updated last week
- Deploying the 1.5B model distilled from DeepSeek-R1 on Android phones☆23Feb 4, 2025Updated last year
- ☆23Oct 30, 2024Updated last year
- A Python client library for connecting to the Xiaozhi AI service. It provides a simple interface for voice conversation and text interaction.☆26Mar 14, 2025Updated 11 months ago
- ☆11Mar 13, 2023Updated 2 years ago
- A Chrome DevTools Extension for OpenSumi.☆14Apr 22, 2024Updated last year
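- Intent recognition and slot filling implemented with PyTorch and BERT☆11May 30, 2023Updated 2 years ago
- For personal use: speech-to-text uses SenseVoice, the LLM part calls the Ollama API, text-to-speech uses CosyVoice, and real-time voice input follows https://github.com/ABexit/ASR-LLM-TTS.☆12Dec 26, 2024Updated last year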
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be…☆1,788Feb 25, 2026Updated 2 weeks ago
- ☆10May 27, 2025Updated 9 months ago
- A Vue mobile shopping mall project, a demo built while practicing Vue.☆10Jan 6, 2023Updated 3 years ago
- Automate the batch upload and parsing of documents into Dify's knowledge base, reducing manual intervention and wait time.☆14Aug 29, 2024Updated last year
- ASR using the OpenAI-compatible API `v1/audio/transcriptions`, as offered by providers like Groq and SiliconFlow☆32Aug 29, 2024Updated last year
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆29Sep 20, 2024Updated last year
- Multilingual Voice Understanding Model☆7,669Dec 30, 2025Updated 2 months ago
- An API interface project for CosyVoice☆336Aug 31, 2025Updated 6 months ago
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆19,913Feb 11, 2026Updated last month
- ☆22Jul 30, 2025Updated 7 months ago
- WhisperMesh is an advanced chatbot that integrates voice and text interactions, delivering personalized responses through LLM models and …☆15Apr 23, 2025Updated 10 months ago
- This is a speaker-diarization-based speech recognition API implemented using FunASR.☆23Feb 12, 2026Updated last month
- ☆33Feb 28, 2025Updated last year
- This is a multi-character, ultra-personalized StoryTeller. It includes: 1) efficiently and accurately build multi-character voice library…☆58Feb 2, 2025Updated last year
- chai2010's blog☆12Feb 18, 2026Updated 3 weeks ago
- CPU inference version of VisemeNet-tensorflow☆14Nov 6, 2019Updated 6 years ago
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT)☆20Oct 28, 2025Updated 4 months ago
- Uses ffmpeg from Go to mix video☆11May 28, 2023Updated 2 years ago
- LiveKit + Next.js AI voice agent interface☆16Feb 21, 2025Updated last year
- OCRFusion is an integrated solution that combines multiple open-source OCR (Optical Character Recognition) models, layout analysis, and t…☆16Jul 30, 2024Updated last year
- ☆69Jul 17, 2024Updated last year
- Pseudo Streaming SenseVoice with Hotwords☆434Mar 13, 2025Updated 11 months ago