A fast multimodal LLM for real-time voice
★4,368 Dec 12, 2025 Updated 2 months ago
Alternatives and similar repositories for ultravox
Users that are interested in ultravox are comparing it to the libraries listed below
- Open Source framework for voice and multimodal conversational AI ★10,529 Updated this week
- A framework for building realtime voice AI agents 🤖🎙️📹 ★9,562 Updated this week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ★9,502 Jul 11, 2025 Updated 7 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ★9,750 Feb 12, 2026 Updated 3 weeks ago
- SOTA Open Source TTS ★25,078 Feb 2, 2026 Updated last month
- Open-source framework for conversational voice AI agents ★10,094 Updated this week
- Inference and training library for high-quality TTS models. ★5,541 Dec 10, 2024 Updated last year
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone ★24,027 Feb 23, 2026 Updated last week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ★27,949 Sep 30, 2025 Updated 5 months ago
- Build, run, manage agentic software at scale. ★38,276 Mar 1, 2026 Updated last week
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ★9,706 May 27, 2025 Updated 9 months ago
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ★6,452 Mar 1, 2026 Updated last week
- first base model for full-duplex conversational audio ★1,783 Jan 5, 2025 Updated last year
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ★8,463 Mar 15, 2025 Updated 11 months ago
- Foundational model for human-like, expressive TTS ★4,198 Jul 30, 2024 Updated last year
- ⚡️ GenBI (Generative BI) queries any database in natural language, generates accurate SQL (Text-to-SQL), charts (Text-to-Chart), and AI-p… ★14,528 Updated this week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ★3,128 May 19, 2025 Updated 9 months ago
- The python library for real-time communication ★4,535 Jan 12, 2026 Updated last month
- The AI Browser Automation Framework ★21,356 Updated this week
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … ★11,492 Feb 10, 2026 Updated 3 weeks ago
- Automate browser based workflows with AI ★20,629 Updated this week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ★32,752 Feb 24, 2026 Updated last week
- Local realtime voice AI ★2,434 Nov 26, 2025 Updated 3 months ago
- An open-source RAG-based tool for chatting with your documents. ★25,168 Updated this week
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ★17,068 Updated this week
- tiny vision language model ★9,386 Nov 14, 2025 Updated 3 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ★6,187 Aug 10, 2024 Updated last year
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ★4,486 Updated this week
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. ★7,295 Feb 27, 2026 Updated last week
- Perplexica is an AI-powered answering engine. ★30,120 Feb 13, 2026 Updated 3 weeks ago
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ★21,340 Feb 24, 2026 Updated last week
- PraisonAI is a production-ready Multi AI Agents framework, designed to create AI Agents to automate and solve problems ranging from simpl… ★5,608 Mar 1, 2026 Updated last week
- Universal memory layer for AI Agents ★48,604 Updated this week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ★36,025 Apr 19, 2025 Updated 10 months ago
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ★53,029 Updated this week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ★61,332 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ★13,206 Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ★7,343 Feb 21, 2025 Updated last year
- 🙌 OpenHands: AI-Driven Development ★68,459 Updated this week