pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
★5,937 · Updated this week
Alternatives and similar repositories for pipecat:
Users who are interested in pipecat are comparing it to the libraries listed below.
- A powerful framework for building realtime voice AI agents ★5,807 · Updated this week
- A fast multimodal LLM for real-time voice ★3,896 · Updated 2 months ago
- Converts text to speech in realtime ★2,983 · Updated this week
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ★4,004 · Updated 3 weeks ago
- Fast and accurate automatic speech recognition (ASR) for edge devices ★2,696 · Updated 2 months ago
- Local realtime voice AI ★2,287 · Updated 2 months ago
- Towards Human-Sounding Speech ★4,633 · Updated 3 weeks ago
- Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including O… ★4,325 · Updated this week
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri… ★6,987 · Updated this week
- An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Co… ★3,660 · Updated 2 months ago
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ★8,162 · Updated this week
- A language model programming library. ★5,754 · Updated 2 months ago
- Structured outputs for LLMs ★10,311 · Updated this week
- Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you automate the web wi… ★4,292 · Updated this week
- A visual playground for agentic workflows: Iterate over your agents 10x faster ★4,818 · Updated last month
- The Python library for real-time communication ★3,824 · Updated 2 weeks ago
- Tiny vision language model ★7,882 · Updated 3 weeks ago
- Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team. ★19,718 · Updated last month
- File Parser optimised for LLM Ingestion with no loss. Parse PDFs, Docx, PPTx in a format that is ideal for LLMs. ★6,377 · Updated 2 months ago
- Build Real-Time Knowledge Graphs for AI Agents ★8,122 · Updated this week
- Inference and training library for high-quality TTS models. ★5,229 · Updated 4 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ★2,907 · Updated 2 weeks ago
- Automate browser-based workflows with LLMs and Computer Vision ★13,237 · Updated this week
- https://hf.co/hexgrad/Kokoro-82M ★2,591 · Updated 3 weeks ago
- Agent Framework / shim to use Pydantic with LLMs ★9,143 · Updated this week
- Build voice-based LLM agents. Modular + open source. ★3,295 · Updated 5 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ★1,598 · Updated 9 months ago
- Agno is a lightweight library for building Agents with memory, knowledge, tools and reasoning. ★26,158 · Updated this week
- Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 ★3,715 · Updated this week
- Large Action Model framework to develop AI Web Agents ★6,036 · Updated 3 months ago