Sample Repository for the AlibabaCloud Bailian Speech SDK
☆403Dec 19, 2025Updated 4 months ago
Alternatives and similar repositories for alibabacloud-bailian-speech-demo
Users that are interested in alibabacloud-bailian-speech-demo are comparing it to the libraries listed below.
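- Content moderation and rate limiting service☆26May 18, 2025Updated 11 months ago
- “alibabacloud-nls-python-sdk provides access to Alibaba Cloud intelligent speech services, including speech recognition, speech synthesis, and file transcription.”☆82Aug 22, 2025Updated 8 months ago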
- RTC AIGC Demo☆267Mar 25, 2026Updated last month
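- 百聆 is a GPT-4o-style voice conversation bot built with ASR+LLM+TTS. It integrates strong large models such as DeepSeek R1, connects to openClaw, and acts as a true personal voice assistant, with latency as low as 800 ms. It runs on low-spec machines such as Macs and supports interruption☆1,683Apr 6, 2026Updated last month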
- This is a speech interaction system built on an open-source model, integrating ASR, LLM, and TTS in sequence. The ASR model is SenceVoice…☆1,179Mar 1, 2025Updated last year
- silero-vad PyTorch implementation☆36Nov 23, 2024Updated last year
- Deploying the DeepSeek-R1 distilled 1.5B model on Android phones☆24Feb 4, 2025Updated last year
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity…☆15,939Mar 17, 2026Updated last month
- Visual dialogue for Xiaozhi (小智)☆33Apr 25, 2025Updated last year
- faster inference☆28Jan 20, 2025Updated last year
- CPU inference version of VisemeNet-tensorflow☆14Nov 6, 2019Updated 6 years ago
- ESP32 component that helps connect to WiFi☆85Apr 24, 2026Updated last week
- Automate the batch upload and parsing of documents into Dify's knowledge base, reducing manual intervention and wait time.☆14Aug 29, 2024Updated last year
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be…☆1,862Feb 25, 2026Updated 2 months ago
- A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5☆52Mar 19, 2025Updated last year
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆20,846Updated this week
- ☆23Oct 30, 2024Updated last year
- ASR using OpenAI capability API `v1/audio/transcriptions` like Groq, SiliconFlow☆32Aug 29, 2024Updated last year
- Multilingual Voice Understanding Model☆8,072Dec 30, 2025Updated 4 months ago
- This is a multi-character, ultra-personalized StoryTeller. It includes: 1) efficiently and accurately build multi-character voice library…☆63Feb 2, 2025Updated last year
- ☆23Feb 23, 2025Updated last year
- Compute WER and SER for speech recognition evaluation☆27Mar 18, 2026Updated last month
- This is a Python client library for connecting to the Xiaozhi (小智) AI service. It provides a simple interface for voice conversation and text interaction.☆25Mar 14, 2025Updated last year
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT)☆21Oct 28, 2025Updated 6 months ago
- This is a speaker-diarization-based speech recognition API implemented using FunASR.☆25Feb 12, 2026Updated 2 months ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆134Apr 26, 2023Updated 3 years ago
- ☆11Mar 13, 2023Updated 3 years ago
- Utilizes ONNX Runtime for audio denoising.☆123Dec 27, 2025Updated 4 months ago
- The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.☆1,899Jul 5, 2024Updated last year
- A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors☆28Jul 30, 2025Updated 9 months ago
- A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization☆2,919Dec 8, 2025Updated 4 months ago
- Real-time interactive digital human with customizable appearance and voice, supporting voice cloning, with conversation latency as low as 3 s.☆1,237Dec 18, 2025Updated 4 months ago
- A solution for denoising and separating two-speaker mixed noisy speech, using a BSRNN-inspired network.☆15Aug 22, 2023Updated 2 years ago
- OCRFusion is an integrated solution that combines multiple open-source OCR (Optical Character Recognition) models, layout analysis, and t…☆16Jul 30, 2024Updated last year
- An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Spe…☆4,105Aug 14, 2025Updated 8 months ago
- ☆69Jul 17, 2024Updated last year
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆29Mar 13, 2026Updated last month
- A question-answering knowledge base built from PDF documents, implemented with LangChain☆39Apr 12, 2024Updated 2 years ago
- Alibaba Cloud · AUI Kits AI call scenario☆54Mar 2, 2026Updated 2 months ago