A fast multimodal LLM for real-time voice
☆4,447 · Dec 12, 2025 · Updated 6 months ago
Alternatives and similar repositories for ultravox
Users who are interested in ultravox are comparing it to the libraries listed below.
- Open Source framework for voice and multimodal conversational AI ☆12,842 · Updated this week
- A framework for building realtime voice AI agents 🤖🎙️📹 ☆10,933 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆10,397 · May 16, 2026 · Updated last month
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ☆9,902 · Updated this week
- Inference and training library for high-quality TTS models. ☆5,579 · Dec 10, 2024 · Updated last year
- A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone ☆25,598 · Jun 4, 2026 · Updated last week
- first base model for full-duplex conversational audio ☆1,788 · Jan 5, 2025 · Updated last year
- SOTA Open Source TTS ☆30,816 · Jun 9, 2026 · Updated last week
- Open-source framework for conversational voice AI agents ☆10,670 · Updated this week
- Build, run, and manage agent platforms. ☆40,674 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆28,370 · Sep 30, 2025 · Updated 8 months ago
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆9,842 · Mar 25, 2026 · Updated 2 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆3,142 · May 19, 2025 · Updated last year
- LLM-Driven Extraction of Unstructured Data: Built for API Deployments & ETL Pipeline Workflows ☆6,651 · Updated this week
- Give AI agents the context to query business data correctly through the open context layer that gives AI agents grounded, governed memory… ☆15,560 · Updated this week
- Local realtime voice AI ☆2,484 · Nov 26, 2025 · Updated 6 months ago
- The python library for real-time communication ☆4,606 · Jan 12, 2026 · Updated 5 months ago
- Foundational model for human-like, expressive TTS ☆4,199 · Jul 30, 2024 · Updated last year
- The SDK For Browser Agents ☆23,071 · Updated this week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ☆35,119 · Mar 26, 2026 · Updated 2 months ago
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,490 · May 30, 2026 · Updated 2 weeks ago
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … ☆11,715 · Mar 20, 2026 · Updated 2 months ago
- Build local voice agents with open-source models ☆4,874 · Updated this week
- tiny vision language model ☆9,760 · Apr 20, 2026 · Updated last month
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,288 · Aug 10, 2024 · Updated last year
- An open-source RAG-based tool for chatting with your documents. ☆25,467 · Jun 9, 2026 · Updated last week
- Automate browser based workflows with AI ☆21,870 · Jun 10, 2026 · Updated last week
- YC (S26) | AI that knows what you've seen, said, or heard. Records everything you do, say, hear 24/7, local, private, secure ☆19,305 · Updated this week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆68,181 · Jun 4, 2026 · Updated last week
- 🤖 Build voice-based LLM agents. Modular + open source. ☆3,761 · Nov 15, 2024 · Updated last year
- Vane is an AI-powered answering engine. ☆35,308 · Apr 11, 2026 · Updated 2 months ago
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆36,705 · Apr 19, 2025 · Updated last year
- 🙌 OpenHands: AI-Driven Development ☆77,312 · Updated this week
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆9,313 · Mar 26, 2026 · Updated 2 months ago
- Universal memory layer for AI Agents ☆58,243 · Jun 10, 2026 · Updated last week
- NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever Library … ☆2,936 · Updated this week
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ☆66,620 · Updated this week
- Platform for stateful agents: AI with advanced memory that can learn and self-improve over time. ☆23,318 · May 14, 2026 · Updated last month
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. ☆7,466 · May 7, 2026 · Updated last month