A fast multimodal LLM for real-time voice
★4,424 · Dec 12, 2025 · Updated 5 months ago
Alternatives and similar repositories for ultravox
Users who are interested in ultravox are comparing it to the libraries listed below.
- Open Source framework for voice and multimodal conversational AI ★12,468 · Updated this week
- A framework for building realtime voice AI agents ★10,704 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ★10,261 · May 16, 2026 · Updated last week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ★9,811 · Updated this week
- Inference and training library for high-quality TTS models. ★5,573 · Dec 10, 2024 · Updated last year
- A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone ★25,116 · May 19, 2026 · Updated last week
- first base model for full-duplex conversational audio ★1,788 · Jan 5, 2025 · Updated last year
- SOTA Open Source TTS ★30,474 · May 12, 2026 · Updated 2 weeks ago
- Open-source framework for conversational voice AI agents ★10,600 · May 20, 2026 · Updated last week
- Build, run, and manage agent platforms. ★40,307 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ★28,257 · Sep 30, 2025 · Updated 7 months ago
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ★9,816 · Mar 25, 2026 · Updated 2 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ★3,142 · May 19, 2025 · Updated last year
- LLM-Driven Extraction of Unstructured Data - Built for API Deployments & ETL Pipeline Workflows ★6,585 · Updated this week
- Give AI agents the context to query business data correctly through the open context layer that gives AI agents grounded, governed memory… ★15,334 · Updated this week
- Local realtime voice AI ★2,482 · Nov 26, 2025 · Updated 6 months ago
- The python library for real-time communication ★4,587 · Jan 12, 2026 · Updated 4 months ago
- Foundational model for human-like, expressive TTS ★4,197 · Jul 30, 2024 · Updated last year
- The SDK For Browser Agents ★22,722 · May 19, 2026 · Updated last week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ★34,669 · Mar 26, 2026 · Updated 2 months ago
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ★8,485 · Mar 15, 2025 · Updated last year
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … ★11,672 · Mar 20, 2026 · Updated 2 months ago
- Build local voice agents with open-source models ★4,755 · May 21, 2026 · Updated last week
- tiny vision language model ★9,707 · Apr 20, 2026 · Updated last month
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ★6,263 · Aug 10, 2024 · Updated last year
- An open-source RAG-based tool for chatting with your documents. ★25,394 · Apr 3, 2026 · Updated last month
- Automate browser based workflows with AI ★21,733 · Updated this week
- YC (S26) | Give AI the ability to live your experience. Records everything you do, say, hear 24/7, local, private, secure ★18,877 · Updated this week
- Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ★66,299 · Updated this week
- Build voice-based LLM agents. Modular + open source. ★3,750 · Nov 15, 2024 · Updated last year
- Vane is an AI-powered answering engine. ★34,887 · Apr 11, 2026 · Updated last month
- Instant voice cloning by MIT and MyShell. Audio foundation model. ★36,558 · Apr 19, 2025 · Updated last year
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ★9,110 · Mar 26, 2026 · Updated 2 months ago
- Universal memory layer for AI Agents ★56,739 · Updated this week
- OpenHands: AI-Driven Development ★74,798 · Updated this week
- NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extracti… ★2,923 · May 20, 2026 · Updated last week
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ★65,100 · Updated this week
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ★22,903 · May 14, 2026 · Updated last week
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. ★7,464 · May 7, 2026 · Updated 2 weeks ago