Shadowfita / parakeet-tdt-0.6b-v2-fastapi
A FastAPI wrapper for NVIDIA's new Parakeet TDT 0.6B v2, a 600-million-parameter ASR model designed for high-quality English speech recognition
☆110 Updated 4 months ago
Alternatives and similar repositories for parakeet-tdt-0.6b-v2-fastapi
Users that are interested in parakeet-tdt-0.6b-v2-fastapi are comparing it to the libraries listed below
- Interface for OuteTTS models. ☆1,390 Updated 4 months ago
- High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs. ☆578 Updated 3 months ago
- Run Orpheus 3B Locally With LM Studio ☆478 Updated 7 months ago
- G2P ☆334 Updated 2 months ago
- Open source inference code for Rev's model ☆432 Updated 6 months ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆847 Updated 4 months ago
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆622 Updated 6 months ago
- A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats includ… ☆836 Updated last month
- A Fast TTS Engine ☆555 Updated 9 months ago
- Streaming and Fine-tuning for Chatterbox TTS ☆200 Updated 4 months ago
- ☆237 Updated last week
- Realtime demo, Streaming and Finetuning code for CSM ☆405 Updated last month
- An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines ☆474 Updated last year
- Running the F5-TTS by ONNX Runtime ☆179 Updated last month
- ☆466 Updated 5 months ago
- Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible),… ☆584 Updated 3 months ago
- speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with… ☆238 Updated 2 months ago
- Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits f… ☆1,301 Updated 6 months ago
- A local implementation of the Kokoro Text-to-Speech model, featuring dynamic module loading, automatic dependency management, and a web i… ☆225 Updated 2 months ago
- ☆222 Updated 2 weeks ago
- Open Audio Watermarking Tool ☆356 Updated 4 months ago
- A real-time speech-to-speech chatbot powered by Whisper Small, Llama 3.2, and Kokoro-82M. ☆245 Updated 9 months ago
- FastAPI service on top of WhisperX ☆141 Updated this week
- Speech-to-speech AI assistant with natural conversation flow, mid-speech interruption, vision capabilities and AI-initiated follow-ups. F… ☆251 Updated 6 months ago
- VibeVoice: Expressive, longform conversational speech synthesis. (Community fork) ☆632 Updated this week
- ☆346 Updated last year
- Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework. ☆2,493 Updated last month
- Implementation of F5-TTS in MLX ☆592 Updated 7 months ago
- Examples of using the llasa-tts models locally ☆181 Updated 6 months ago
- ☆524 Updated 3 weeks ago