guest271314 / SSMLParser
Implement SSML parsing for Web Speech API
☆40 Updated 5 years ago
Alternatives and similar repositories for SSMLParser
Users who are interested in SSMLParser are comparing it to the libraries listed below.
- Putting flows on top of neural transducers for better TTS ☆65 Updated 3 weeks ago
- Labeled data for homograph disambiguation ☆62 Updated 2 years ago
- A simple voice conversion tool ☆19 Updated 3 years ago
- Lyra V2 (SoundStream) running in the browser ☆19 Updated 2 years ago
- The EveryVoice TTS Toolkit - Text To Speech for your language ☆41 Updated this week
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆189 Updated 2 weeks ago
- [Last Updated 2021] TTS from Cookie. Messy and experimental! ☆43 Updated 2 years ago
- An even smaller speech recognizer / force aligner ☆37 Updated last year
- A converter from Arpabet to IPA (see https://en.wikipedia.org/wiki/Arpabet) ☆17 Updated 8 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 Updated 3 years ago
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated last year
- flask+tornado based NVIDIA tacotron2+waveglow tts web app ☆28 Updated 2 years ago
- 🐸TTS recipes for different datasets ☆86 Updated 3 years ago
- ☆44 Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆106 Updated last year
- Heteronym to Phoneme Parser ☆19 Updated 2 years ago
- PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS ☆24 Updated 4 years ago
- StyleTTS 2 Optimized Training Fork ☆33 Updated last year
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l… ☆25 Updated last year
- Google's SoundStorm: Efficient Parallel Audio Generation ☆131 Updated 2 years ago
- A high-quality, varied ~30hr voice dataset suitable for training a TTS model ☆63 Updated 3 years ago
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆34 Updated 5 years ago
- Creates video from TTS output and viseme images. ☆16 Updated 3 years ago
- Finally, some decent sample sentences ☆23 Updated 2 years ago
- Zero-Shot Emotion Style Transfer ☆49 Updated 9 months ago
- pytorch model for contexless-phoneme prediction from speech audio ☆30 Updated 3 months ago
- Multi-speaker Speech Synthesis Using VITS(KO, JA, EN, ZH) ☆76 Updated last year
- Timething is a library for aligning text transcripts with their audio recordings. ☆128 Updated last year
- Monotonic Alignment Search ☆100 Updated 8 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆14 Updated 2 months ago