SesameAILabs / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆47Updated 5 months ago
Alternatives and similar repositories for whisperX:
Users who are interested in whisperX are comparing it to the libraries listed below.
- Faster Whisper with additional features☆39Updated 3 weeks ago
- Sesame Converse - Real Time Conversations - Powered by Gemma 3☆58Updated 2 weeks ago
- High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs.☆160Updated last week
- Run Orpheus 3B Locally With LM Studio☆309Updated 2 weeks ago
- List of curated use cases built using Sesame's CSM 1B☆56Updated 2 weeks ago
- deep hermes, but decides how to respond based on its OWN decision, no need for system prompts.☆32Updated last month
- Sesame CSM 1B Voice Cloning☆246Updated 2 weeks ago
- OpenAI compatible TTS for Sesame CSM:1b - Voice Cloning from File/YT☆227Updated last week
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA]☆62Updated 2 weeks ago
- A Conversational Speech Generation Model with Gradio UI and OpenAI compatible API. UI and API support CUDA, MLX and CPU devices.☆143Updated last week
- Examples of using the llasa-tts models locally☆158Updated 2 months ago
- Open source tool for transcription and subtitling, alternative to happyscribe.☆25Updated last month
- Since the owner of the repo took it down and it used an MIT license, I guess it's okay to upload it here for people to use.☆32Updated 3 weeks ago
- Whisper STT + Orpheus TTS + Gemma 3 using LM Studio to create a virtual assistant.☆36Updated this week
- ☆19Updated 2 months ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆21Updated this week
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and…☆60Updated this week
- Real-Time Transcription Using OpenAI Whisper☆118Updated last month
- Orpheus Chat WebUI☆32Updated last week
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"☆77Updated 5 months ago
- API server for Instant voice cloning by MyShell.☆88Updated 6 months ago
- ☆38Updated 6 months ago
- ☆36Updated last month
- A Multi-modal MCP client for voice powered agentic workflows☆149Updated 2 months ago
- A local implementation of the Kokoro Text-to-Speech model, featuring dynamic module loading, automatic dependency management, and a web i…☆146Updated last week
- Run Orpheus 3B Locally With LM Studio☆24Updated 2 weeks ago
- II-Researcher: a new open-source framework designed to aid building search / research agents☆107Updated this week
- An implementation of the CSM (Conversation Speech Model) for Apple Silicon using MLX.☆262Updated last week
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning☆156Updated 8 months ago
- Realtime TTS reading of large text files by your favourite voice. +Translation via LLM (Python script)☆52Updated 5 months ago