hexgrad / misaki
G2P
☆35 Updated this week
Alternatives and similar repositories for misaki:
Users who are interested in misaki are comparing it to the libraries listed below.
- StyleTTS 2 Optimized Training Fork ☆18 Updated this week
- Misc. tools/scripts that I made to use for tortoise ☆21 Updated 5 months ago
- Real-time end-to-end singing voice conversion ☆19 Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆53 Updated last week
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 7 months ago
- Convert your PDFs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient… ☆37 Updated 2 weeks ago
- ☆62 Updated 6 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆16 Updated 9 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆114 Updated last week
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very … ☆31 Updated last month
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆90 Updated 3 months ago
- ☆24 Updated 7 months ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech ☆22 Updated 5 months ago
- zero-shot realtime TTS system, fully offline, free and open source ☆24 Updated 2 weeks ago
- Audio tokenization, in the fastest way possible! ☆46 Updated 5 months ago
- ☆33 Updated 2 months ago
- Codec for paper: LLaSA: Scaling Train-time and Test-time Compute for LLaMA-based Speech Synthesis ☆126 Updated 2 weeks ago
- A lightweight Python library for running TTS models with a unified API. ☆16 Updated 2 weeks ago
- ☆28 Updated last year
- My vocoder experiments ☆26 Updated 3 months ago
- Joint speech-language model - respond directly to audio! ☆30 Updated 8 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆65 Updated 3 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization a ONNX runtime port… ☆16 Updated 5 months ago
- Pytorch implementation of SoundCTM ☆76 Updated last month
- The demo page of UniAudio ☆34 Updated 11 months ago
- [ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion ☆39 Updated 2 weeks ago
- VoiceBox neural network implementation ☆100 Updated 5 months ago
- GPT-style network for phonemization with durations of text ☆64 Updated 10 months ago
- Official Code for ParrotTTS ☆46 Updated 3 months ago