QwenLM / Qwen3-ASR-Toolkit
Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support.
☆742 Updated 3 months ago
Alternatives and similar repositories for Qwen3-ASR-Toolkit
Users who are interested in Qwen3-ASR-Toolkit are comparing it to the libraries listed below.
- GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters ☆684 Updated 3 weeks ago
- MiMo-Audio: Audio Language Models are Few-Shot Learners ☆962 Updated 4 months ago
- ☆535 Updated 3 months ago
- Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions. ☆711 Updated this week
- ☆681 Updated 3 weeks ago
- A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model that excels at editing emotion, speaking style, and paralinguistics… ☆829 Updated this week
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆371 Updated 3 weeks ago
- ☆487 Updated this week
- ☆473 Updated 8 months ago
- Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation… ☆1,313 Updated 4 months ago
- GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning ☆889 Updated last month
- ☆433 Updated 3 weeks ago
- Google's NotebookLM but local ☆801 Updated last month
- ☆635 Updated 2 months ago
- An Automatic Prompt Optimization Framework for Large Language Models ☆878 Updated 5 months ago
- "OpenPhone: Mobile Agentic Foundation Models for AI Phone" ☆588 Updated last month
- An open-source implementation of Whisper ☆475 Updated 2 months ago
- Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages ☆2,611 Updated 3 weeks ago
- ☆483 Updated 8 months ago
- TTS model capable of streaming conversational audio in realtime. ☆1,023 Updated last month
- A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime. ☆336 Updated this week
- Learn to build and deploy local Visual Language Models for Edge AI ☆371 Updated 2 months ago
- Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE. ☆804 Updated 3 months ago
- A real-time Electron-based desktop GUI for DeepSeek-OCR ☆731 Updated last month
- Open-source framework for developing real-time multimodal conversational AI agents. ☆583 Updated this week
- Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab. ☆793 Updated last week
- ☆822 Updated 3 months ago
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆563 Updated 2 months ago
- Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video ge… ☆1,203 Updated 2 weeks ago
- ☆1,084 Updated 3 months ago