Sample Repository for the AlibabaCloud Bailian Speech SDK
☆410Dec 19, 2025Updated 5 months ago
Alternatives and similar repositories for alibabacloud-bailian-speech-demo
Users that are interested in alibabacloud-bailian-speech-demo are comparing it to the libraries listed below.
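- Content moderation and rate limiting service☆26May 18, 2025Updated last year
- “alibabacloud-nls-python-sdk provides access to Alibaba Cloud intelligent speech services, including speech recognition, speech synthesis, and file transcription.”☆82Aug 22, 2025Updated 9 months ago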
- RTC AIGC Demo☆276Mar 25, 2026Updated 2 months ago
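- Bailing (百聆) is a GPT-4o-like voice conversation bot built from ASR+LLM+TTS. It integrates strong large models such as DeepSeek R1, connects to openClaw, and serves as a true personal voice assistant, with latency as low as 800 ms. It runs on low-spec machines such as Macs and supports interruption.☆1,687Apr 6, 2026Updated last month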
- ☆15Jul 4, 2024Updated last year
- This is a speech interaction system built on an open-source model, integrating ASR, LLM, and TTS in sequence. The ASR model is SenseVoice…☆1,193Mar 1, 2025Updated last year
- Deploying the DeepSeek-R1 distilled 1.5B model on Android phones☆24Feb 4, 2025Updated last year
- Visual dialogue for Xiaozhi☆33Apr 25, 2025Updated last year
- Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-…☆16,264Updated this week
- Skribify is a powerful transcription and summarization tool that leverages the power of OpenAI's GPT-4 and WhisperAI to generate concise …☆12Apr 29, 2025Updated last year
- faster inference☆28Jan 20, 2025Updated last year
- CPU inference version of VisemeNet-tensorflow☆14Nov 6, 2019Updated 6 years ago
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be…☆1,887Feb 25, 2026Updated 3 months ago
- A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5☆52Mar 19, 2025Updated last year
- ☆23Oct 30, 2024Updated last year
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆21,226May 3, 2026Updated 3 weeks ago
- ASR using OpenAI capability API `v1/audio/transcriptions` like Groq, SiliconFlow☆32Aug 29, 2024Updated last year
- Multilingual Voice Understanding Model☆8,216May 19, 2026Updated last week
- This is a multi-character, ultra-personalized StoryTeller. It includes: 1) efficiently and accurately build multi-character voice library…☆64Feb 2, 2025Updated last year
- Compute WER and SER for speech recognition evaluation☆26Mar 18, 2026Updated 2 months ago
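- General-purpose simple utilities project☆22Oct 6, 2024Updated last year
- A Python client library for connecting to the Xiaozhi AI service. It provides a simple interface for voice conversations and text interaction.☆25Mar 14, 2025Updated last year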
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT)☆21Oct 28, 2025Updated 6 months ago
- A speaker-diarization-based speech recognition API implemented using FunASR.☆26Feb 12, 2026Updated 3 months ago
- A library for adding punctuation into a text from ASR.☆19May 8, 2023Updated 3 years ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆136Apr 26, 2023Updated 3 years ago
- ☆11Mar 13, 2023Updated 3 years ago
- Intent recognition with large language models☆11Aug 14, 2024Updated last year
- Utilizes ONNX Runtime for audio denoising.☆125Dec 27, 2025Updated 4 months ago
- xuanxuan is an open source IM solution.☆26Feb 17, 2017Updated 9 years ago
- The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.☆1,896Jul 5, 2024Updated last year
- chai2010's blog☆12May 4, 2026Updated 3 weeks ago
- A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization☆2,939Dec 8, 2025Updated 5 months ago
- Real-time interactive digital human with customizable appearance and voice, supporting voice cloning, with conversation latency as low as 3 s.☆1,248Dec 18, 2025Updated 5 months ago
- A solution for denoising and separating noisy two-speaker mixed speech, using a BSRNN-inspired network.☆15Aug 22, 2023Updated 2 years ago
- An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Spe…☆4,152Aug 14, 2025Updated 9 months ago
- ☆69Jul 17, 2024Updated last year
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆29Mar 13, 2026Updated 2 months ago
- A question-answering knowledge base built from PDF documents, implemented with LangChain☆39Apr 12, 2024Updated 2 years ago