A fast multimodal LLM for real-time voice
★ 4,398 · Dec 12, 2025 · Updated 4 months ago
Alternatives and similar repositories for ultravox
Users that are interested in ultravox are comparing it to the libraries listed below.
- Open Source framework for voice and multimodal conversational AI (★ 11,217 · Updated this week)
- A framework for building realtime voice AI agents (★ 10,054 · Updated this week)
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… (★ 10,010 · Mar 4, 2026 · Updated last month)
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… (★ 9,692 · Mar 14, 2026 · Updated last month)
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone (★ 24,365 · Apr 1, 2026 · Updated 2 weeks ago)
- Inference and training library for high-quality TTS models. (★ 5,560 · Dec 10, 2024 · Updated last year)
- first base model for full-duplex conversational audio (★ 1,784 · Jan 5, 2025 · Updated last year)
- SOTA Open Source TTS (★ 29,257 · Apr 6, 2026 · Updated last week)
- Open-source framework for conversational voice AI agents (★ 10,390 · Updated this week)
- Build, run, manage agentic software at scale. (★ 39,343 · Apr 10, 2026 · Updated last week)
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. (★ 28,073 · Sep 30, 2025 · Updated 6 months ago)
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… (★ 9,754 · Mar 25, 2026 · Updated 3 weeks ago)
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… (★ 3,139 · May 19, 2025 · Updated 10 months ago)
- LLM-Driven Extraction of Unstructured Data - Built for API Deployments & ETL Pipeline Workflows (★ 6,537 · Updated this week)
- Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language - get accura… (★ 14,924 · Updated this week)
- Local realtime voice AI (★ 2,479 · Nov 26, 2025 · Updated 4 months ago)
- The python library for real-time communication (★ 4,565 · Jan 12, 2026 · Updated 3 months ago)
- Foundational model for human-like, expressive TTS (★ 4,196 · Jul 30, 2024 · Updated last year)
- The SDK For Browser Agents (★ 22,072 · Updated this week)
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … (★ 34,018 · Mar 26, 2026 · Updated 3 weeks ago)
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … (★ 11,606 · Mar 20, 2026 · Updated 3 weeks ago)
- Zero-Shot Speech Editing and Text-to-Speech in the Wild (★ 8,469 · Mar 15, 2025 · Updated last year)
- Build local voice agents with open-source models (★ 4,660 · Updated this week)
- tiny vision language model (★ 9,575 · Nov 14, 2025 · Updated 5 months ago)
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models (★ 6,238 · Aug 10, 2024 · Updated last year)
- An open-source RAG-based tool for chatting with your documents. (★ 25,260 · Apr 3, 2026 · Updated 2 weeks ago)
- Automate browser based workflows with AI (★ 21,143 · Updated this week)
- Run agents that work for you based on what you do. AI finally knows what you are doing (★ 18,146 · Updated this week)
- Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN (★ 63,955 · Updated this week)
- Build voice-based LLM agents. Modular + open source. (★ 3,723 · Nov 15, 2024 · Updated last year)
- Vane is an AI-powered answering engine. (★ 33,727 · Updated this week)
- Instant voice cloning by MIT and MyShell. Audio foundation model. (★ 36,216 · Apr 19, 2025 · Updated 11 months ago)
- Universal memory layer for AI Agents (★ 52,987 · Updated this week)
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector (★ 8,819 · Mar 26, 2026 · Updated 3 weeks ago)
- OpenHands: AI-Driven Development (★ 71,108 · Updated this week)
- NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extracti… (★ 2,902 · Updated this week)
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally. (★ 61,312 · Updated this week)
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. (★ 21,988 · Apr 8, 2026 · Updated last week)
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. (★ 7,387 · Mar 28, 2026 · Updated 2 weeks ago)