bungerr / faster-whisper-3
Faster Whisper transcription with CTranslate2
☆8 · Updated last year
Alternatives and similar repositories for faster-whisper-3
Users who are interested in faster-whisper-3 compare it to the libraries listed below.
- Faster distil-whisper transcription with CTranslate2 ☆14 · Updated last year
- StyleTTS 2 Optimized Training Fork ☆28 · Updated 3 months ago
- An unofficial PyTorch implementation of VALL-E ☆87 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper ☆26 · Updated 9 months ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George… ☆19 · Updated 7 months ago
- Triton backend for https://github.com/OpenNMT/CTranslate2 ☆35 · Updated last year
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model. ☆16 · Updated last week
- Open TTS models, built for streaming on the edge ☆41 · Updated 2 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 · Updated 7 months ago
- High quality text-to-speech based on StyleTTS 2. ☆42 · Updated this week
- ☆11 · Updated last week
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆17 · Updated 5 months ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆55 · Updated last month
- StyleTTS2 + Vocos as a Decoder ☆11 · Updated last month
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 · Updated last year
- Japanese Dataset to Multi Language TTS (Only for Japanese Dataset) ☆3 · Updated last year
- Hanasu is a human-like TTS model based on the multilingual Himitsu V1 transformer-based encoder and VITS architecture ☆28 · Updated last month
- ☆20 · Updated 6 years ago
- (WIP) A retrain of F5-TTS on permissively-licensed data ☆11 · Updated last month
- Audio tokenization, in the fastest way possible! ☆52 · Updated 8 months ago
- The Vokan Architecture (Tsukasa speech based) ☆9 · Updated 3 months ago
- a Frontier Japanese Speech Generation net ☆34 · Updated last week
- Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer ☆33 · Updated 3 months ago
- Create an LJSpeech structured voice dataset on wave input ☆29 · Updated 7 months ago
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆105 · Updated last month
- ☆8 · Updated 2 years ago
- Faster Whisper ASR transcription with CTranslate2 ☆20 · Updated 6 months ago
- llmon-py is a multimodal webui for Llama 3-8B. ☆16 · Updated 10 months ago
- ☆53 · Updated 3 months ago
- End-To-End SpeechSynthesis system with knowledge distillation ☆16 · Updated 2 years ago