speaches-ai/speaches

speaches-ai / speaches

☆3,535

Alternatives and similar repositories for speaches

Users who are interested in speaches are comparing it to the libraries listed below.

SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,451 · Nov 19, 2025 · Updated 8 months ago
remsky / Kokoro-FastAPI
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/multiplatform CPU, AMD, NVIDIA GPU PyTorch support, handling, and auto-s…
☆5,238 · Updated this week
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,653 · Nov 12, 2025 · Updated 8 months ago
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,189 · Jul 13, 2026 · Updated last week
collabora / WhisperLive
A nearly-live implementation of OpenAI's Whisper.
☆4,147 · Updated this week
matatonic / openedai-speech
An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.
☆860 · Feb 2, 2025 · Updated last year
ahmetoner / whisper-asr-webservice
OpenAI Whisper ASR Webservice API
☆3,304 · Nov 23, 2025 · Updated 7 months ago
roryeckel / wyoming_openai
OpenAI-Compatible Proxy Middleware for the Wyoming Protocol
☆201 · Jul 3, 2026 · Updated 2 weeks ago
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆52,166 · Jul 11, 2026 · Updated last week
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,645 · Updated this week
KoljaB / RealtimeTTS
Converts text to speech in realtime
☆3,996 · May 31, 2026 · Updated last month
Vaibhavs10 / insanely-fast-whisper
☆12,993 · Oct 25, 2025 · Updated 8 months ago
pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
☆13,643 · Updated this week
BerriAI / litellm
The fastest, litest AI Gateway. Rust core with Python SDK. Call 100+ LLM APIs in OpenAI (or native) format with cost tracking, guardrails…
☆54,241 · Updated this week
mudler / LocalAI
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
☆47,746 · Updated this week
travisvn / chatterbox-tts-api
Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate voice cloned speech anywhere the OpenAI AP…
☆629 · Dec 23, 2025 · Updated 7 months ago
k2-fsa / sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime…
☆13,712 · Updated this week
open-webui / open-webui
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
☆146,342 · Updated this week
resemble-ai / chatterbox
SoTA open-source TTS
☆25,633 · Updated this week
rhasspy / piper
A fast, local neural text to speech system
☆11,254 · Aug 26, 2025 · Updated 10 months ago
mostlygeek / llama-swap
Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc
☆5,103 · Updated this week
Lex-au / Orpheus-FastAPI
High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs.
☆715 · Jul 5, 2025 · Updated last year
fishaudio / fish-speech
SOTA Open Source TTS
☆31,354 · Jun 9, 2026 · Updated last month
QuentinFuxa / WhisperLiveKit
Simultaneous speech-to-text models
☆10,556 · Updated this week
huggingface / speech-to-speech
Build local voice agents with open-source models
☆6,279 · Updated this week
KoljaB / RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri…
☆10,002 · Jun 12, 2026 · Updated last month
matatonic / openedai-whisper
An OpenAI API compatible speech to text server for audio transcription and translations, aka. Whisper.
☆91 · Feb 2, 2025 · Updated last year
InterfazeAI / insanely-fast-whisper-api
An API to transcribe audio with OpenAI's Whisper Large v3!
☆355 · Nov 13, 2024 · Updated last year
vllm-project / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆86,804 · Updated this week
livekit / agents
A framework for building realtime voice AI agents 🤖🎙️📹
☆11,471 · Updated this week
unslothai / unsloth
Unsloth is a local UI for training and running Gemma 4, Qwen3.6, DeepSeek, Kimi, GLM and other models.
☆68,666 · Updated this week
ItzCrazyKns / Vane
Vane is an AI-powered answering engine.
☆35,839 · Apr 11, 2026 · Updated 3 months ago
moonshine-ai / moonshine
Very low latency speech to text, intent recognition, and text to speech, for building voice agents and interfaces
☆10,277 · Updated this week
av / harbor
Stop configuring your AI stack. Start using it. One command brings a complete pre-wired LLM stack with hundreds of services to explore.
☆3,144 · Updated this week
murtaza-nasir / speakr
Speakr is a personal, self-hosted web application designed for transcribing audio recordings
☆3,531 · Jul 15, 2026 · Updated last week
MODSetter / SurfSense
Open-source NotebookLM alternative. Research the open web with live data, through one platform, API or MCP server. Join our Discord: http…
☆15,295 · Updated this week
hexgrad / kokoro
https://hf.co/hexgrad/Kokoro-82M
☆8,069 · Aug 6, 2025 · Updated 11 months ago
kyutai-labs / delayed-streams-modeling
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
☆2,982 · Jan 26, 2026 · Updated 5 months ago
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,606 · Feb 23, 2026 · Updated 4 months ago