QwenLM / Qwen3-ASR-Toolkit
Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support.
☆684Updated last month
Alternatives and similar repositories for Qwen3-ASR-Toolkit
Users who are interested in Qwen3-ASR-Toolkit are comparing it to the libraries listed below.
- ☆527Updated last month
- MiMo-Audio: Audio Language Models are Few-Shot Learners☆851Updated 2 months ago
- ☆621Updated 3 weeks ago
- ☆467Updated 6 months ago
- Google's NotebookLM, but local☆617Updated 2 months ago
- Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video ge…☆1,088Updated this week
- A real-time Electron-based desktop GUI for DeepSeek-OCR☆649Updated 3 weeks ago
- A minimal yet professional single agent demo project that showcases the core execution pipeline and production-grade features of agents.☆572Updated last week
- A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics…☆650Updated this week
- ☆635Updated last week
- An open-source implementation of Whisper☆455Updated 3 weeks ago
- ☆468Updated 6 months ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate☆296Updated 5 months ago
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning☆2,099Updated last month
- Tencent Hunyuan A13B (short as Hunyuan-A13B), an innovative and open-source LLM built on a fine-grained MoE architecture.☆805Updated 4 months ago
- Open-source framework for developing real-time multimodal conversational AI agents.☆522Updated last week
- ☆1,024Updated 3 weeks ago
- Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages☆2,019Updated last week
- Learn to build and deploy local Visual Language Models for Edge AI☆322Updated 3 weeks ago
- VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)☆734Updated 3 weeks ago
- Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE.☆736Updated last month
- A quick vibe-coded app for DeepSeek-OCR☆1,422Updated this week
- Generate Web Pages and Components with text prompts, with Local Models. (or Cloud Models, if you want) - now supports Thinking Models!☆397Updated 4 months ago
- Unlimited text-to-speech in the Browser using Kokoro-JS, 100% local, 100% open source☆309Updated 5 months ago
- Build, enrich, and transform datasets using AI models with no code☆1,564Updated 3 weeks ago
- ☆312Updated 2 months ago
- ☆774Updated last month
- A command-line interface tool for serving LLMs using vLLM.☆443Updated last month
- A powerful browser assistant for vibe surfing☆199Updated last week
- Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation…☆1,226Updated 2 months ago