fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆4,350 · Dec 12, 2025 · Updated 2 months ago
Alternatives and similar repositories for ultravox
Users who are interested in ultravox are comparing it to the libraries listed below.
- Open Source framework for voice and multimodal conversational AI ☆10,263 · Updated this week
- A framework for building realtime voice AI agents 🤖🎙️📹 ☆9,324 · Updated this week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ☆9,454 · Jul 11, 2025 · Updated 7 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆9,593 · Updated this week
- SOTA Open Source TTS ☆24,863 · Feb 2, 2026 · Updated last week
- Open-source framework for conversational voice AI agents ☆9,859 · Updated this week
- Inference and training library for high-quality TTS models. ☆5,528 · Dec 10, 2024 · Updated last year
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone ☆23,756 · Updated this week
- Build multi-agent systems that learn and improve with every interaction. ☆37,691 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ☆27,897 · Sep 30, 2025 · Updated 4 months ago
- Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio… ☆9,687 · May 27, 2025 · Updated 8 months ago
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆6,095 · Updated this week
- first base model for full-duplex conversational audio ☆1,773 · Jan 5, 2025 · Updated last year
- Foundational model for human-like, expressive TTS ☆4,190 · Jul 30, 2024 · Updated last year
- Zero-Shot Speech Editing and Text-to-Speech in the Wild ☆8,465 · Mar 15, 2025 · Updated 11 months ago
- ⚡️ GenBI (Generative BI) queries any database in natural language, generates accurate SQL (Text-to-SQL), charts (Text-to-Chart), and AI-p… ☆14,380 · Updated this week
- The python library for real-time communication ☆4,519 · Jan 12, 2026 · Updated last month
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆3,123 · May 19, 2025 · Updated 8 months ago
- The AI Browser Automation Framework ☆21,077 · Updated this week
- Automate browser based workflows with AI ☆20,399 · Updated this week
- Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. … ☆32,468 · Jan 6, 2026 · Updated last month
- Local realtime voice AI ☆2,429 · Nov 26, 2025 · Updated 2 months ago
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet and cites it too. … ☆11,412 · Updated this week
- An open-source RAG-based tool for chatting with your documents. ☆25,019 · Jul 4, 2025 · Updated 7 months ago
- screenpipe turns your computer into a personal AI that knows everything you've done. record. search. automate. all local, all private, al… ☆16,810 · Updated this week
- tiny vision language model ☆9,329 · Nov 14, 2025 · Updated 3 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,162 · Aug 10, 2024 · Updated last year
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆4,416 · Feb 6, 2026 · Updated last week
- Perplexica is an AI-powered answering engine. ☆28,872 · Jan 10, 2026 · Updated last month
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. ☆7,231 · Updated this week
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ☆21,024 · Jan 29, 2026 · Updated 2 weeks ago
- PraisonAI is a production-ready Multi AI Agents framework, designed to create AI Agents to automate and solve problems ranging from simpl… ☆5,599 · Feb 5, 2026 · Updated last week
- Universal memory layer for AI Agents ☆47,230 · Feb 3, 2026 · Updated last week
- Instant voice cloning by MIT and MyShell. Audio foundation model. ☆35,918 · Apr 19, 2025 · Updated 9 months ago
- Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM. ☆51,922 · Updated this week
- 🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN ☆59,947 · Updated this week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆7,275 · Feb 21, 2025 · Updated 11 months ago
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,155 · Feb 8, 2026 · Updated last week
- 🙌 OpenHands: AI-Driven Development ☆67,779 · Updated this week