pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
★8,776 · Updated this week
Alternatives and similar repositories for pipecat
Users who are interested in pipecat are comparing it to the libraries listed below.
- A powerful framework for building realtime voice AI agents ★8,152 · Updated this week
- A fast multimodal LLM for real-time voice ★4,252 · Updated 2 months ago
- Local realtime voice AI ★2,375 · Updated 8 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ★4,230 · Updated 6 months ago
- first base model for full-duplex conversational audio ★1,768 · Updated 10 months ago
- The python library for real-time communication ★4,383 · Updated last month
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ★3,090 · Updated 5 months ago
- Fast and accurate automatic speech recognition (ASR) for edge devices ★2,951 · Updated 3 weeks ago
- Build voice-based LLM agents. Modular + open source. ★3,605 · Updated 11 months ago
- Converts text to speech in realtime ★3,613 · Updated 3 months ago
- React app for inspecting, building and debugging with the Realtime API ★3,498 · Updated 2 months ago
- ★1,013 · Updated last month
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality ★4,393 · Updated last year
- Towards Human-Sounding Speech ★5,709 · Updated 6 months ago
- The Open Source Memory Layer For Autonomous Agents ★2,505 · Updated last year
- NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extra… ★2,760 · Updated this week
- Voice activity detector (VAD) for the browser with a simple API ★1,671 · Updated last week
- An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Co… ★5,601 · Updated last week
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ★2,904 · Updated last month
- Have a natural, spoken conversation with AI! ★3,322 · Updated 4 months ago
- Deploy your agentic workflows to production ★2,059 · Updated 2 months ago
- Open-source framework for conversational voice AI agents ★8,506 · Updated this week
- Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 ★4,712 · Updated this week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ★9,073 · Updated last week
- RAG that intelligently adapts to your use case, data, and queries ★3,575 · Updated last week
- Convert any PDF into a podcast episode! ★2,492 · Updated 11 months ago
- ML-powered speech recognition directly in your browser ★3,140 · Updated last year
- The SOTA Open-Source Browser Agent for autonomously performing complex tasks on the web ★2,320 · Updated 5 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ★6,034 · Updated last year
- The fastest way to build robust AI agents ★1,938 · Updated 4 months ago