KittenML / KittenTTS
State-of-the-art TTS model under 25MB 😻
☆8,881 Updated last month
Alternatives and similar repositories for KittenTTS
Users who are interested in KittenTTS are comparing it to the libraries listed below.
- Have a natural, spoken conversation with AI! ☆3,245 Updated 3 months ago
- https://hf.co/hexgrad/Kokoro-82M ☆4,530 Updated 2 months ago
- Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation ☆4,287 Updated 3 months ago
- ☆5,979 Updated last month
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,436 Updated 3 weeks ago
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆2,723 Updated 2 weeks ago
- Towards Human-Sounding Speech ☆5,603 Updated 5 months ago
- Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching ☆3,759 Updated last week
- Generate audiobooks from EPUBs, PDFs and text with synchronized captions. ☆3,680 Updated this week
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,903 Updated last month
- Kernels & AI inference engine for phone chips ☆3,423 Updated this week
- TTS with kokoro and onnx runtime ☆2,215 Updated 3 months ago
- Frontier Open-Source Text-to-Speech ☆9,540 Updated last month
- On-device TTS model by Neuphonic ☆2,614 Updated this week
- 🌐 The open-source Agentic browser; privacy-first alternative to Perplexity Comet, Arc, Dia ☆4,649 Updated this week
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,291 Updated 5 months ago
- SoTA open-source TTS ☆13,774 Updated 2 weeks ago
- Interface for OuteTTS models. ☆1,384 Updated 3 months ago
- Real-time & local speech-to-text server. ☆7,577 Updated last week
- Local realtime voice AI ☆2,370 Updated 7 months ago
- ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now,… ☆2,945 Updated this week
- First base model for full-duplex conversational audio ☆1,765 Updated 9 months ago
- Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with… ☆4,847 Updated last week
- Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi… ☆8,978 Updated last week
- Local-first AI Notepad for Private Meetings ☆6,286 Updated this week
- This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025 ☆6,746 Updated 5 months ago
- A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats includ… ☆800 Updated last month
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆18,568 Updated 3 months ago
- Voice Activity Detector (VAD): low-latency, high-performance and lightweight ☆1,482 Updated 3 weeks ago
- Make text LLMs listen and speak ☆904 Updated last week