kyutai-labs / pocket-tts
A TTS that fits in your CPU (and pocket)
☆2,683 Updated last week
Alternatives and similar repositories for pocket-tts
Users who are interested in pocket-tts are comparing it to the libraries listed below.
- Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX. ☆2,552 Updated 2 weeks ago
- TTS model capable of streaming conversational audio in realtime. ☆1,027 Updated 2 months ago
- Soprano: Instant, Ultra-Realistic Text-to-Speech ☆1,137 Updated 3 weeks ago
- A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime. ☆632 Updated last week
- Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages ☆2,620 Updated last month
- Make text LLMs listen and speak ☆1,152 Updated last week
- On-device TTS model by Neuphonic ☆4,718 Updated 3 weeks ago
- Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible),… ☆996 Updated last month
- ☆385 Updated 3 months ago
- VibeVoice: Expressive, longform conversational speech synthesis. (Community fork) ☆964 Updated last week
- Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate voice cloned speech anywhere the OpenAI AP… ☆514 Updated last month
- ☆511 Updated this week
- A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats includ… ☆1,148 Updated last month
- Build AI applications that can see, hear, and speak using your screens, microphones, and cameras as inputs. ☆1,078 Updated last month
- Run Orpheus 3B Locally With LM Studio ☆510 Updated 10 months ago
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,822 Updated last week
- PersonaPlex code. ☆4,504 Updated last week
- Optimized Whisper models for streaming and on-device use ☆816 Updated this week
- Whisper-Flow is a framework designed to enable real-time transcription of audio content using OpenAI’s Whisper model. Rather than process… ☆557 Updated this week
- The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trai… ☆3,256 Updated last month
- A lightweight text-to-speech model with zero-shot voice cloning ☆764 Updated 3 weeks ago
- Open-source framework for developing real-time multimodal conversational AI agents. ☆587 Updated last week
- A lightning fast audio upsampler. ☆664 Updated 2 weeks ago
- A high quality and fast TTS repository ☆486 Updated last month
- Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streamin… ☆6,204 Updated last week
- A real-time silent speech recognition tool. ☆694 Updated 3 months ago
- High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs. ☆648 Updated 7 months ago
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆347 Updated 9 months ago
- Official Python toolkit for the Qwen3-ASR API. Parallel high‑throughput calls, robust long‑audio transcription, multi‑sample‑rate support… ☆800 Updated 3 months ago
- An open-source implementation of Whisper ☆477 Updated 3 months ago