QuentinFuxa / WhisperLiveKit
Real-time & local speech-to-text, translation, and speaker diarization. With server & web UI.
☆7,431Updated last week
Alternatives and similar repositories for WhisperLiveKit
Users who are interested in WhisperLiveKit are comparing it to the libraries listed below.
- The python library for real-time communication☆4,326Updated 2 weeks ago
- Frontier Open-Source Text-to-Speech☆9,439Updated last month
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec…☆2,705Updated last week
- Local-first AI Notepad for Private Meetings☆6,248Updated last week
- Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containeriz…☆8,143Updated 3 weeks ago
- A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri…☆8,689Updated 2 months ago
- ☆5,972Updated last month
- Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with…☆4,835Updated 2 months ago
- Have a natural, spoken conversation with AI!☆3,219Updated 2 months ago
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation☆4,277Updated 3 months ago
- Super Magic. The first open-source all-in-one AI productivity platform (Generalist AI Agent + Workflow Engine + IM + Online collaborative…☆4,264Updated 2 weeks ago
- https://hf.co/hexgrad/Kokoro-82M☆4,492Updated 2 months ago
- Generate audiobooks from EPUBs, PDFs and text with synchronized captions.☆3,660Updated 2 weeks ago
- Towards Human-Sounding Speech☆5,597Updated 5 months ago
- A free and open source, self hosted Ai based live meeting note taker and minutes summary generator that can completely run in your Local …☆7,647Updated this week
- SoTA open-source TTS☆13,628Updated last week
- zero-shot voice conversion & singing voice conversion, with real-time support☆3,287Updated 5 months ago
- Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and se…☆3,892Updated last week
- Video translation and dubbing tool powered by LLMs. The video translator offers 100 language translations and one-click full-process depl…☆8,590Updated 3 weeks ago
- Catalog of official Microsoft MCP (Model Context Protocol) server implementations for AI-powered data access and tool integration☆1,918Updated this week
- A fast multimodal LLM for real-time voice☆4,211Updated last month
- State-of-the-art TTS model under 25MB 😻☆8,754Updated last month
- Voice Activity Detector (VAD) : low-latency, high-performance and lightweight☆1,476Updated 3 weeks ago
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.☆2,413Updated 2 weeks ago
- The world's first open-source multimodal creative assistant This is a substitute for Canva and Manus that prioritizes privacy and is usa…☆4,895Updated last week
- Modern Backend Framework that unifies APIs, background jobs, workflows, and AI Agents into a single core primitive with built-in observab…☆8,735Updated last week
- Multilingual Document Layout Parsing in a Single Vision-Language Model☆4,850Updated 3 weeks ago
- Open Source Application for Advanced LLM + Diffusion Engineering: interact, train, fine-tune, and evaluate large language models on your …☆4,380Updated this week
- 💖🧸 Self hosted, you owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achiev…☆14,662Updated this week
- 🔥 An intelligent data-query system based on large language models and RAG. Text-to-SQL Generation via LLMs using RAG.☆3,590Updated last week