fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆3,589 Updated last week
Alternatives and similar repositories for ultravox:
Users who are interested in ultravox are comparing it to the libraries listed below.
- Build real-time multimodal AI applications 🤖🎙️📹 ☆5,110 Updated this week
- Open Source framework for voice and multimodal conversational AI ☆4,778 Updated this week
- Local realtime voice AI ☆2,224 Updated this week
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,564 Updated 2 weeks ago
- first base model for full-duplex conversational audio ☆1,707 Updated last month
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆2,809 Updated 3 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆7,506 Updated last week
- File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ☆5,533 Updated this week
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆3,753 Updated 2 months ago
- No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents ☆4,431 Updated this week
- 🔥 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web wi… ☆3,650 Updated this week
- An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Co… ☆3,173 Updated this week
- Flexible and powerful framework for managing multiple AI agents and handling complex conversations ☆4,206 Updated this week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ☆5,920 Updated this week
- Task-Aware Agent-driven Prompt Optimization Framework ☆2,817 Updated last month
- Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision. ☆3,244 Updated this week
- Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel… ☆6,318 Updated this week
- NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other ent… ☆2,533 Updated this week
- 📃 A better UX for chat, writing content, and coding with LLMs. ☆3,898 Updated last week
- PraisonAI is a production-ready Multi AI Agents framework, designed to create AI Agents to automate and solve problems ranging from simpl… ☆3,446 Updated last week
- A pattern for an always on AI Assistant powered by Deepseek-V3, RealtimeSTT, and Typer for engineering ☆856 Updated last month
- LLM-powered multiagent persona simulation for imagination enhancement and business insights. ☆5,503 Updated last week
- The first AI agent that builds permissionless integrations through reverse engineering platforms' internal APIs. ☆4,083 Updated last week
- Convert any PDF into a podcast episode! ☆2,045 Updated 2 months ago
- Inference and training library for high-quality TTS models. ☆5,025 Updated 2 months ago
- Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching ☆1,604 Updated last week
- A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech mode… ☆907 Updated 3 months ago