TheStageAI/TheWhisper

TheStageAI / TheWhisper

Optimized Whisper models for streaming and on-device use

☆892

Alternatives and similar repositories for TheWhisper

Users interested in TheWhisper are comparing it to the repositories listed below.

allenai / OLMoASR
An open-source implementation of Whisper
☆492 • Oct 29, 2025 • Updated 8 months ago
neuphonic / neutts
On-device TTS model by Neuphonic
☆6,196 • Updated this week
evalops / cognitive-dissonance-dspy
A multi-agent LLM system for detecting and resolving cognitive dissonance.
☆282 • Apr 25, 2026 • Updated 2 months ago
bytedance / USO
[CVPR 2026] 🔥🔥 Official Repo of USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
☆1,227 • Sep 12, 2025 • Updated 10 months ago
nari-labs / dia2
TTS model capable of streaming conversational audio in realtime.
☆1,159 • Nov 29, 2025 • Updated 7 months ago
OpenGVLab / NaViL
☆95 • Oct 10, 2025 • Updated 9 months ago
FoundationAgents / ReCode
Next paradigm for LLM agents. Unifies plan and action through recursive code generation for adaptive, human-like decision-making.
☆561 • Apr 21, 2026 • Updated 3 months ago
zhengxuJosh / Awesome-Multimodal-Spatial-Reasoning
This repository collects and organises state‑of‑the‑art papers on spatial reasoning for Multimodal Vision–Language Models (MVLMs).
☆319 • Feb 17, 2026 • Updated 5 months ago
Liquid4All / liquid-audio
Liquid Audio - Speech-to-Speech audio models by Liquid AI
☆548 • Jun 5, 2026 • Updated last month
CaviraOSS / OpenMemory
Local persistent memory store for LLM applications including Claude Desktop, GitHub Copilot, Codex, Antigravity, etc.
☆4,372 • Updated this week
facebookresearch / omnilingual-asr
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
☆2,859 • Dec 30, 2025 • Updated 6 months ago
herimor / voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking Rate Control
☆245 • May 30, 2026 • Updated last month
videosdk-live / agents
Open-source framework for developing real-time multimodal conversational AI agents.
☆630 • Updated this week
Danau5tin / Orca-Agent-RL
Qwen3-14B Orchestrator Agent Reinforcement Learning. Achieved 160% improvement on Stanford's TerminalBench
☆102 • Nov 3, 2025 • Updated 8 months ago
gensyn-ai / codeassist
A completely private and local AI coding assistant, developed by Gensyn. It helps you practice programming problems and train a novel ass…
☆703 • Mar 2, 2026 • Updated 4 months ago
ysharma3501 / FlashSR
Fast audio super resolution from 16 kHz to 48 kHz.
☆215 • Jan 3, 2026 • Updated 6 months ago
K-Dense-AI / karpathy
An agentic Machine Learning Engineer
☆1,513 • May 29, 2026 • Updated last month
zai-org / GLM-ASR
GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters
☆836 • Mar 6, 2026 • Updated 4 months ago
supertone-inc / supertonic
Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.
☆13,500 • Updated this week
iamanigeeit / present
☆14 • Aug 19, 2024 • Updated last year
RohanAdwankar / cgpu
CLI enabling free cloud GPU access in your terminal for learning CUDA.
☆143 • Nov 30, 2025 • Updated 7 months ago
QwenLM / Qwen3-ASR-Toolkit
Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support…
☆981 • Feb 5, 2026 • Updated 5 months ago
dzhng / claude-agent-server
Run Claude Agent (Claude Code) in a sandbox, control it via websocket
☆581 • Dec 28, 2025 • Updated 6 months ago
idiap / knn-tts
Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model
☆36 • Apr 29, 2025 • Updated last year
Amal-David / keyleak-detector
Runtime leak detector for modern web apps — finds exposed API keys, validates BaaS misconfigurations (Supabase/Firebase RLS), and catches…
☆264 • Updated this week
Vyvo-Labs / VyvoTTS
VyvoTTS: LLM-Based Text-to-Speech Training Framework
☆257 • Apr 8, 2026 • Updated 3 months ago
Danau5tin / multi-agent-coding-system
Reached #13 on Stanford's Terminal Bench leaderboard. Orchestrator, explorer & coder agents working together with intelligent context sha…
☆1,413 • Nov 3, 2025 • Updated 8 months ago
microsoft / VibeVoice
Open-Source Frontier Voice AI
☆50,472 • Updated this week
nrposner / coding_club
A repo for my CCN Coding Club talk, 'Python, Rust, and You: Modern Py-Rust Interoperation'
☆45 • Aug 12, 2025 • Updated 11 months ago
VectorSpaceLab / general-agentic-memory
A general memory system for agents, powered by deep-research
☆857 • Mar 14, 2026 • Updated 4 months ago
espresso3389 / MioTTS-llama.cpp
A fast, lightweight text-to-speech tool that runs entirely on your CPU. Give it text, pick a voice, and get a WAV file out.
☆67 • Feb 22, 2026 • Updated 5 months ago
Ido-Levi / Hephaestus
Semi-Structured Agentic Framework. Workflows build themselves as agents discover what needs to be done, not what you predicted upfront.
☆1,177 • Dec 1, 2025 • Updated 7 months ago
AkshathRaghav / tinyspeech
Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"
☆23 • Jun 7, 2025 • Updated last year
kyutai-labs / pocket-tts
A TTS that fits in your CPU (and pocket)
☆7,868 • Jul 16, 2026 • Updated last week
Deep-unlearning / Llasa-GRPO
☆18 • Nov 19, 2025 • Updated 8 months ago
NVlabs / OmniVinci
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
☆674 • Feb 26, 2026 • Updated 4 months ago
pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
☆13,687 • Updated this week
memodb-io / Acontext
Agent Skills as a Memory Layer
☆3,584 • Jul 14, 2026 • Updated last week
jdh-algo / JoyTTS
☆41 • Jul 15, 2025 • Updated last year