hathibelagal-dev / str2speech
An easy-to-use library and command-line tool for TTS
☆14 Updated this week
Alternatives and similar repositories for str2speech:
Users who are interested in str2speech are comparing it to the libraries listed below:
- Streaming Markdown parser for tui clis ☆15 Updated this week
- Unblur Photos with Fotor's AI Enlarger ☆17 Updated last year
- Hanasu is a human-like TTS model based on the multilingual Himitsu V1 transformer-based encoder and VITS architecture ☆26 Updated 2 weeks ago
- ☆24 Updated 2 years ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆52 Updated last year
- Transcribe audio and video files with speaker diarization and logically grouped timestamps ☆19 Updated last month
- Documentation for the Krixik Python client. ☆38 Updated 5 months ago
- Open Server is an OpenAI API Compatible Server for generating text, images, embeddings, and storing them in vector databases. It also inc… ☆16 Updated last year
- With a few words and a click of a button, quickly get an engaging, high quality video. (And optionally save and share it!) ☆17 Updated 2 months ago
- Glanceables is a handy macOS desktop app that turns parts of websites into easy-to-view widgets. This app makes it simpler to keep tabs o… ☆52 Updated 8 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆19 Updated 6 months ago
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and… ☆65 Updated 3 weeks ago
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT) ☆15 Updated 3 months ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription ☆77 Updated 10 months ago
- Turn a doc into plaintext which you can listen to using TTS ☆19 Updated 2 years ago
- Local11Labs allows generating high-quality text-to-speech and podcast content using the fast and tiny Kokoro-82M. ☆46 Updated 3 months ago
- A QT GUI for large language models ☆32 Updated last year
- On-device speaker diarization powered by deep learning ☆44 Updated last month
- Faster Whisper ASR transcription with CTranslate2 ☆20 Updated 6 months ago
- A full-text search for YouTube subtitles and video metadata with a command line interface. ☆31 Updated 2 months ago
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆232 Updated 8 months ago
- On-device streaming text-to-speech engine powered by deep learning ☆76 Updated this week
- Drop in replacement for OpenAI's embedding API. Self Hosted. ☆53 Updated last year
- An interface for llama.cpp, ChatGPT, and Gemini ☆26 Updated 3 weeks ago
- Search a JSON path and get the value fast ☆22 Updated 2 months ago
- Self hosted high quality voice recognition for de-googled Android using whisper. Like Siri or OK Google. ☆63 Updated last year
- Whisper STT + Orpheus TTS + Gemma 3 using LM Studio to create a virtual assistant. ☆43 Updated 3 weeks ago
- 360M model running in the browser on WebGPU ☆21 Updated 8 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 Updated 11 months ago
- SEPIA server to support open-source speech recognition via WebSocket connection. ☆125 Updated 5 months ago