QwenLM / Qwen3-ASR-Toolkit
Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support.
☆605Updated 2 weeks ago
Alternatives and similar repositories for Qwen3-ASR-Toolkit
Users who are interested in Qwen3-ASR-Toolkit are comparing it to the libraries listed below.
- ☆520Updated last week
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate☆285Updated 4 months ago
- MiMo-Audio: Audio Language Models are Few-Shot Learners☆731Updated 2 weeks ago
- ☆634Updated 2 months ago
- ☆464Updated 4 months ago
- An open-source implementation of Whisper☆439Updated this week
- Google's NotebookLM but local☆480Updated 3 weeks ago
- Generate Web Pages and Components with text prompts, with Local Models. (or Cloud Models, if you want) - now supports Thinking Models!☆391Updated 3 months ago
- Make text LLMs listen and speak☆904Updated last week
- ☆576Updated last month
- Open-source framework for developing real-time multimodal conversational AI agents.☆471Updated this week
- Tencent Hunyuan A13B (Hunyuan-A13B for short), an innovative and open-source LLM built on a fine-grained MoE architecture.☆754Updated 3 months ago
- Build, enrich, and transform datasets using AI models with no code☆1,502Updated this week
- ☆456Updated 5 months ago
- A command-line interface tool for serving LLMs using vLLM.☆425Updated last month
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC)☆309Updated 6 months ago
- VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)☆548Updated 2 weeks ago
- Kyutai with an "eye"☆221Updated 6 months ago
- Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video ge…☆966Updated 3 months ago
- ☆258Updated last month
- Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation…☆1,139Updated 2 weeks ago
- Self-host the ultra-lightweight Kitten TTS model with this enhanced API server with an intuitive Web UI, large text processing for audiob…☆202Updated 2 months ago
- VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning☆1,641Updated last week
- ☆1,743Updated this week
- Build AI applications that can see, hear, and speak using your screens, microphones, and cameras as inputs.☆928Updated this week
- Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE.☆610Updated last month
- Long-form streaming TTS system for multi-speaker dialogue generation☆739Updated 3 weeks ago
- ☆300Updated 2 months ago
- vLLM port of the Chatterbox TTS model☆308Updated last month
- AI tool for auto-research, TTS, and Graphical assembly into a completed Podcast☆80Updated 2 months ago