NeuralVox / OpenPhonemizer
An espeak-compatible, permissively licensed IPA phonemizer (G2P) based on DeepPhonemizer. It can be used as a drop-in replacement for espeak's GPL phonemizer.
☆83 · Updated last month
Related projects
Alternatives and complementary repositories for OpenPhonemizer
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 · Updated last year
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training ☆119 · Updated 2 years ago
- VALL-E 2 reproduction ☆83 · Updated 3 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆121 · Updated 8 months ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆135 · Updated 6 months ago
- FlashSpeech: Efficient Zero-Shot Speech Synthesis ☆93 · Updated last month
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning ☆114 · Updated 4 months ago
- VoiceBox neural network implementation ☆96 · Updated 3 months ago
- Style-Controllable Zero-Shot Text to Speech Synthesizer based on VALL-E ☆135 · Updated 2 weeks ago
- Finetuning VITS Efficiently ☆32 · Updated last year
- Monotonic Alignment Search ☆86 · Updated 2 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆112 · Updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆65 · Updated 7 months ago
- Official implementation of Vec-Tok Speech ☆93 · Updated last year
- All generative model in one for better TTS model ☆66 · Updated 2 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆179 · Updated 2 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆75 · Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆51 · Updated last year
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch. ☆93 · Updated last year
- A sequence-to-sequence voice conversion toolkit. ☆85 · Updated 4 months ago
- Implementation of SoundStorm built upon SpeechTokenizer. ☆103 · Updated last year
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆90 · Updated last week
- The official Implementation of PeriodWave and PeriodWave-Turbo ☆128 · Updated 2 months ago
- An unofficial PyTorch implementation of VALL-E ☆75 · Updated this week
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆48 · Updated 4 months ago