hexgrad / misaki
G2P
☆210 Updated last week
Alternatives and similar repositories for misaki:
Users who are interested in misaki are comparing it to the libraries listed below.
- LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆516 Updated 2 weeks ago
- ☆203 Updated 2 weeks ago
- Running the F5-TTS by ONNX Runtime ☆146 Updated 2 weeks ago
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆238 Updated last month
- ☆356 Updated 7 months ago
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆224 Updated 3 weeks ago
- Collection of Open Source Speech Data ☆153 Updated 5 months ago
- Open source inference code for Rev's model ☆399 Updated last week
- ☆216 Updated 3 weeks ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆174 Updated 6 months ago
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆253 Updated last month
- ☆95 Updated 11 months ago
- Interface for OuteTTS models. ☆1,178 Updated last week
- A ggml (C++) re-implementation of tortoise-tts ☆178 Updated 8 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆34 Updated last week
- High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs. ☆251 Updated this week
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. ☆232 Updated 7 months ago
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆555 Updated 5 months ago
- Run Orpheus 3B Locally With LM Studio ☆367 Updated last month
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3 ☆403 Updated 7 months ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆158 Updated 9 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆160 Updated this week
- Official implementation of the TTS model Lina-Speech ☆163 Updated 3 months ago
- Streaming and Finetuning code for CSM ☆197 Updated this week
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 6 months ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆104 Updated 2 weeks ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆677 Updated 4 months ago
- Kyutai with an "eye" ☆186 Updated 3 weeks ago
- Official Implementation of StyleTTS ☆430 Updated 3 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆91 Updated last year