fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆847 Updated this week
Related projects:
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,509 Updated last month
- Whisper with Medusa heads ☆774 Updated last week
- Build real-time multimodal AI applications 🤖🎙️📹 ☆1,053 Updated this week
- Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit ☆677 Updated last month
- An extremely fast implementation of Whisper optimized for Apple Silicon using MLX. ☆519 Updated 4 months ago
- Open Source framework for voice and multimodal conversational AI ☆3,044 Updated this week
- Joint speech-language model - respond directly to audio! ☆312 Updated 2 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆891 Updated last week
- The Open Source Memory Layer For Autonomous Agents ☆1,390 Updated last week
- Python & JS/TS SDK for running AI-generated code/code interpreting in your AI app ☆1,097 Updated this week
- ☆419 Updated this week
- Example UI implementing the RTVI web client ☆468 Updated last month
- End-to-end platform for building voice-first multimodal agents ☆377 Updated 3 weeks ago
- High-performance retrieval engine for unstructured data ☆778 Updated this week
- Build and query dynamic, temporally-aware Knowledge Graphs ☆572 Updated this week
- ☆1,079 Updated 2 months ago
- Local SRT/LLM/TTS Voicechat ☆471 Updated last month
- Turnkey self-hosted offline transcription and diarization service with LLM summary ☆689 Updated 3 months ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆459 Updated this week
- Command Your World with Voice ☆368 Updated 3 weeks ago
- Suno AI's Bark model in C/C++ for fast text-to-speech ☆684 Updated 2 months ago
- Stateful load balancer custom-tailored for llama.cpp ☆518 Updated this week
- LLM Analytics ☆593 Updated last month
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale. ☆821 Updated 8 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆421 Updated 10 months ago
- ☆366 Updated last month
- BAML is a language that helps you get structured data from LLMs, with the best DX possible. Check out the promptfiddle.com playground ☆987 Updated this week
- Vision utilities for web interaction agents 👀 ☆1,373 Updated this week
- DOM to Semantic-Markdown for use with LLMs ☆630 Updated this week
- Deepgram Conversational AI demo ☆324 Updated 2 weeks ago