Convert English text from written expressions into spoken forms
☆28Jun 22, 2022Updated 3 years ago
Alternatives and similar repositories for TTSTextNormalization
Users who are interested in TTSTextNormalization are comparing it to the libraries listed below
- This is the source code of the paper "Neural grapheme-to-phoneme conversion with pretrained grapheme models"☆48Mar 25, 2022Updated 3 years ago
- Google's SoundStorm: Efficient Parallel Audio Generation☆131Aug 8, 2023Updated 2 years ago
- ☆55Jan 13, 2023Updated 3 years ago
- Labeled data for homograph disambiguation☆62Jun 1, 2023Updated 2 years ago
- Phoneme alignment representation compatible with multiple forced aligners☆22Apr 12, 2024Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21May 26, 2025Updated 9 months ago
- ☆23Oct 17, 2024Updated last year
- KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fas…☆23Updated this week
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆175Dec 18, 2023Updated 2 years ago
- MOS score prediction by fine-tuned wav2vec2.0 model☆175Oct 20, 2022Updated 3 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- Reimplementation of Miipher☆29Aug 16, 2023Updated 2 years ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE"☆44Apr 10, 2023Updated 2 years ago
- ☆68Jul 16, 2023Updated 2 years ago
- ☆26Jun 5, 2024Updated last year
- One command to start a streaming ASR server.☆12Oct 2, 2024Updated last year
- ☆10Sep 2, 2024Updated last year
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE☆15Nov 30, 2022Updated 3 years ago
- 💬📝 A small dictation app using OpenAI's Whisper speech recognition model.☆11Sep 13, 2024Updated last year
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 3 months ago
- ☆28Nov 15, 2023Updated 2 years ago
- Incorporating AutoVocoder to MB-iSTFT-VITS☆48Dec 1, 2022Updated 3 years ago
- ☆69May 19, 2023Updated 2 years ago
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization☆103Feb 5, 2024Updated 2 years ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆111Dec 20, 2024Updated last year
- ☆26Aug 8, 2024Updated last year
- A collection of utilities for handling IPA phones.☆26Sep 24, 2023Updated 2 years ago
- Collection of scripts from mHuBERT-147.☆32Nov 19, 2024Updated last year
- CML-TTS: A Multilingual Dataset for Speech Synthesis☆33Jul 31, 2024Updated last year
- ☆140Jan 7, 2024Updated 2 years ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions☆268Jan 13, 2025Updated last year
- ☆14Aug 16, 2023Updated 2 years ago
- ☆11May 7, 2022Updated 3 years ago
- Openfst mirror with some fixes☆14Aug 23, 2024Updated last year
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-…☆16Feb 1, 2026Updated last month
- Using OpenVINO to speed up MeloTTS inference☆15Nov 1, 2024Updated last year
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆16Dec 3, 2024Updated last year
- ☆10Sep 19, 2022Updated 3 years ago
- ☆23Dec 6, 2025Updated 2 months ago