A fast multimodal LLM for real-time voice
★4,412 · Dec 12, 2025 · Updated 4 months ago
Alternatives and similar repositories for ultravox
Users that are interested in ultravox are comparing it to the libraries listed below.
- Open Source framework for voice and multimodal conversational AI · ★11,687 · Updated this week
- A framework for building realtime voice AI agents · ★10,353 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… · ★10,111 · Apr 28, 2026 · Updated last week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… · ★9,757 · Mar 14, 2026 · Updated last month
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone · ★24,494 · Apr 27, 2026 · Updated last week
- Inference and training library for high-quality TTS models. · ★5,572 · Dec 10, 2024 · Updated last year
- first base model for full-duplex conversational audio · ★1,788 · Jan 5, 2025 · Updated last year
- SOTA Open Source TTS · ★30,034 · Apr 6, 2026 · Updated last month
- Open-source framework for conversational voice AI agents · ★10,462 · Apr 30, 2026 · Updated last week
- Run agents as production software. · ★39,835 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. · ★28,150 · Sep 30, 2025 · Updated 7 months ago
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… · ★9,782 · Mar 25, 2026 · Updated last month
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… · ★3,139 · May 19, 2025 · Updated 11 months ago
- LLM-Driven Extraction of Unstructured Data - Built for API Deployments & ETL Pipeline Workflows · ★6,557 · Apr 30, 2026 · Updated last week
- Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language - get accura… · ★15,061 · Updated this week
- Local realtime voice AI · ★2,484 · Nov 26, 2025 · Updated 5 months ago
- The python library for real-time communication · ★4,587 · Jan 12, 2026 · Updated 3 months ago
- Foundational model for human-like, expressive TTS · ★4,193 · Jul 30, 2024 · Updated last year
- The SDK For Browser Agents · ★22,463 · Updated this week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … · ★34,344 · Mar 26, 2026 · Updated last month
- Zero-Shot Speech Editing and Text-to-Speech in the Wild · ★8,479 · Mar 15, 2025 · Updated last year
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … · ★11,632 · Mar 20, 2026 · Updated last month
- Build local voice agents with open-source models · ★4,716 · Updated this week
- tiny vision language model · ★9,651 · Apr 20, 2026 · Updated 2 weeks ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models · ★6,247 · Aug 10, 2024 · Updated last year
- An open-source RAG-based tool for chatting with your documents. · ★25,350 · Apr 3, 2026 · Updated last month
- Automate browser based workflows with AI · ★21,491 · Updated this week
- Run agents that work based on what you do. 24/7 local screen & mic recording for the superintelligence era · ★18,496 · Updated this week
- Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN · ★64,964 · Apr 30, 2026 · Updated last week
- Build voice-based LLM agents. Modular + open source. · ★3,732 · Nov 15, 2024 · Updated last year
- Vane is an AI-powered answering engine. · ★34,125 · Apr 11, 2026 · Updated 3 weeks ago
- Instant voice cloning by MIT and MyShell. Audio foundation model. · ★36,413 · Apr 19, 2025 · Updated last year
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector · ★8,933 · Mar 26, 2026 · Updated last month
- Universal memory layer for AI Agents · ★54,714 · Apr 30, 2026 · Updated last week
- OpenHands: AI-Driven Development · ★72,542 · Updated this week
- NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extracti… · ★2,920 · Updated this week
- Web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. · ★63,536 · Updated this week
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. · ★22,391 · Apr 12, 2026 · Updated 3 weeks ago
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. · ★7,430 · Apr 28, 2026 · Updated last week