Sample Repository for the AlibabaCloud Bailian Speech SDK
☆396Dec 19, 2025Updated 3 months ago
Alternatives and similar repositories for alibabacloud-bailian-speech-demo
Users who are interested in alibabacloud-bailian-speech-demo are comparing it to the libraries listed below.
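- “alibabacloud-nls-python-sdk provides access to Alibaba Cloud intelligent speech services, including speech recognition, speech synthesis, and file transcription.”☆82Aug 22, 2025Updated 7 months ago
- Bailing (百聆) is a GPT-4o-style voice conversation bot built on ASR+LLM+TTS. It integrates strong large models such as DeepSeek R1 and connects to openClaw to act as a true personal voice assistant, with latency as low as 800 ms. It runs on low-spec machines such as Macs and supports interruption.☆1,666Apr 6, 2026Updated last week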
- ☆15Jul 4, 2024Updated last year
- This is a speech interaction system built on an open-source model, integrating ASR, LLM, and TTS in sequence. The ASR model is SenceVoice…☆1,162Mar 1, 2025Updated last year
- A PyTorch implementation of silero-vad☆36Nov 23, 2024Updated last year
- Deploy the DeepSeek-R1 distilled 1.5B model on Android phones☆24Feb 4, 2025Updated last year
- A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity…☆15,643Mar 17, 2026Updated 3 weeks ago
- Visual conversation for Xiaozhi (小智)☆33Apr 25, 2025Updated 11 months ago
- faster inference☆28Jan 20, 2025Updated last year
- CPU inference version of VisemeNet-tensorflow☆14Nov 6, 2019Updated 6 years ago
- RTC AIGC Demo☆260Mar 25, 2026Updated 3 weeks ago
- An ESP32 component that helps connect to WiFi☆83Mar 22, 2026Updated 3 weeks ago
- Automate the batch upload and parsing of documents into Dify's knowledge base, reducing manual intervention and wait time.☆14Aug 29, 2024Updated last year
- Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be…☆1,837Feb 25, 2026Updated last month
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆20,533Mar 16, 2026Updated last month
- ☆23Oct 30, 2024Updated last year
- ASR using the OpenAI-compatible API `v1/audio/transcriptions`, as offered by providers like Groq and SiliconFlow☆32Aug 29, 2024Updated last year
- Multilingual Voice Understanding Model☆7,957Dec 30, 2025Updated 3 months ago
- This is a multi-character, ultra-personalized StoryTeller. It includes: 1) efficiently and accurately build multi-character voice library…☆62Feb 2, 2025Updated last year
- ☆24Feb 23, 2025Updated last year
- Compute WER and SER for speech recognition evaluation☆27Mar 18, 2026Updated 3 weeks ago
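- A Python client library for connecting to the Xiaozhi AI (小智AI) service. It provides a simple interface for voice conversations and text interaction.☆25Mar 14, 2025Updated last year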
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT)☆20Oct 28, 2025Updated 5 months ago
- A speaker-diarization-based speech recognition API implemented using FunASR.☆22Feb 12, 2026Updated 2 months ago
- A library for adding punctuation into a text from ASR.☆19May 8, 2023Updated 2 years ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆133Apr 26, 2023Updated 2 years ago
- ☆11Mar 13, 2023Updated 3 years ago
- Utilizes ONNX Runtime for audio denoising.☆121Dec 27, 2025Updated 3 months ago
- The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.☆1,888Jul 5, 2024Updated last year
- A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors☆25Jul 30, 2025Updated 8 months ago
- A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization☆2,884Dec 8, 2025Updated 4 months ago
- A real-time interactive digital human with customizable appearance and voice, supporting voice cloning, with conversation latency as low as 3 s.☆1,231Dec 18, 2025Updated 3 months ago
- A solution for denoising and separating noisy two-speaker mixed speech, using a BSRNN-inspired network.☆15Aug 22, 2023Updated 2 years ago
- OCRFusion is an integrated solution that combines multiple open-source OCR (Optical Character Recognition) models, layout analysis, and t…☆16Jul 30, 2024Updated last year
- An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Spe…☆4,028Aug 14, 2025Updated 8 months ago
- ☆69Jul 17, 2024Updated last year
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆29Mar 13, 2026Updated last month
- A Chrome DevTools Extension for OpenSumi.☆14Apr 22, 2024Updated last year
- ☆10May 27, 2025Updated 10 months ago