A fast multimodal LLM for real-time voice
☆4,379 · Dec 12, 2025 · Updated 3 months ago
Alternatives and similar repositories for ultravox
Users that are interested in ultravox are comparing it to the libraries listed below.
- Open Source framework for voice and multimodal conversational AI ☆10,821 · Updated this week
- A framework for building realtime voice AI agents ☆9,833 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆9,898 · Mar 4, 2026 · Updated 3 weeks ago
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ☆9,589 · Mar 14, 2026 · Updated 2 weeks ago
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone ☆24,189 · Mar 7, 2026 · Updated 3 weeks ago
- SOTA Open Source TTS ☆28,614 · Mar 21, 2026 · Updated last week
- first base model for full-duplex conversational audio ☆1,786 · Jan 5, 2025 · Updated last year
- Inference and training library for high-quality TTS models. ☆5,554 · Dec 10, 2024 · Updated last year
- Open-source framework for conversational voice AI agents ☆10,285 · Mar 20, 2026 · Updated last week
- Build, run, manage agentic software at scale. ☆38,835 · Mar 20, 2026 · Updated last week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆28,028 · Sep 30, 2025 · Updated 5 months ago
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆9,721 · May 27, 2025 · Updated 10 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆3,135 · May 19, 2025 · Updated 10 months ago
- LLM-Driven Extraction of Unstructured Data - Built for API Deployments & ETL Pipeline Workflows ☆6,504 · Mar 19, 2026 · Updated last week
- ⚡️ GenBI (Generative BI) queries any database in natural language, generates accurate SQL (Text-to-SQL), charts (Text-to-Chart), and AI-p… ☆14,667 · Updated this week
- Local realtime voice AI ☆2,440 · Nov 26, 2025 · Updated 4 months ago
- The python library for real-time communication ☆4,554 · Jan 12, 2026 · Updated 2 months ago
- Foundational model for human-like, expressive TTS ☆4,205 · Jul 30, 2024 · Updated last year
- The AI Browser Automation Framework ☆21,690 · Updated this week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ☆33,546 · Mar 19, 2026 · Updated last week
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … ☆11,566 · Mar 20, 2026 · Updated last week
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,472 · Mar 15, 2025 · Updated last year
- Build local voice agents with open-source models ☆4,619 · Updated this week
- tiny vision language model ☆9,455 · Nov 14, 2025 · Updated 4 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,222 · Aug 10, 2024 · Updated last year
- An open-source RAG-based tool for chatting with your documents. ☆25,234 · Mar 8, 2026 · Updated 2 weeks ago
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ☆17,389 · Mar 21, 2026 · Updated last week
- Automate browser based workflows with AI ☆20,936 · Updated this week
- Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆62,480 · Mar 21, 2026 · Updated last week
- Build voice-based LLM agents. Modular + open source. ☆3,710 · Nov 15, 2024 · Updated last year
- Vane is an AI-powered answering engine. ☆33,329 · Mar 10, 2026 · Updated 2 weeks ago
- Universal memory layer for AI Agents ☆50,867 · Updated this week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆36,136 · Apr 19, 2025 · Updated 11 months ago
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆8,581 · Updated this week
- Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally. ☆57,673 · Updated this week
- OpenHands: AI-Driven Development ☆69,594 · Updated this week
- NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extracti… ☆2,885 · Updated this week
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆21,680 · Mar 16, 2026 · Updated last week
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. ☆7,340 · Feb 27, 2026 · Updated last month