coqui-ai / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆36 Updated 4 months ago
Related projects
Alternatives and complementary repositories for whisperX
- ☆20 Updated 2 months ago
- An API for VoiceCraft. ☆26 Updated 4 months ago
- ☆87 Updated 6 months ago
- A UI for the Piper TTS ☆66 Updated 2 months ago
- Self-hosted, high-quality voice recognition for de-googled Android using whisper. Like Siri or OK Google. ☆53 Updated 10 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆84 Updated 6 months ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆32 Updated this week
- ☆68 Updated 7 months ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆135 Updated 3 months ago
- Mobile web app for audio "push-to-talk" + TTS chat interface with OpenAI-like APIs ☆36 Updated 10 months ago
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆105 Updated last year
- ☆51 Updated last month
- Open dubbing is an AI dubbing system which uses machine learning models to automatically translate and synchronize audio dialogue into di… ☆57 Updated this week
- A very simple implementation of edge_tts w/ RVC for oobabooga text-generation-webui. ☆41 Updated 9 months ago
- On-device speaker recognition engine powered by deep learning ☆27 Updated last week
- 100% free, local & offline voice assistant with speech recognition ☆58 Updated last month
- On-device streaming text-to-speech engine powered by deep learning ☆54 Updated last week
- Pybind11 bindings for Whisper.cpp ☆45 Updated last week
- Open models for Coqui STT ☆122 Updated last year
- Streaming speech-to-text server using Whisper ☆83 Updated last year
- Using FastChat-T5 Large Language Model, Vosk API for automatic speech recognition, and Piper for text-to-speech ☆110 Updated last year
- Simulates talk with an AI that can express emotions ☆29 Updated 3 months ago
- Web-based editor for subtitles and transcripts ☆111 Updated 2 months ago
- ☆295 Updated 4 months ago
- ☆35 Updated last year
- A QT GUI for large language models ☆24 Updated 10 months ago
- A Qt GUI for large language models ☆40 Updated 11 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆67 Updated 6 months ago
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes. ☆38 Updated 8 months ago
- Starter repository for Deepgram Live Transcription in Flask ☆21 Updated this week