Links to data used in Sproat & Jaitly (https://arxiv.org/abs/1611.00068) experiments.
☆77 · Jul 9, 2021 · Updated 4 years ago
Alternatives and similar repositories for text-normalization-data
Users who are interested in text-normalization-data are comparing it to the repositories listed below.
- RNNs for Text Normalization ☆40 · Dec 12, 2017 · Updated 8 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to… ☆45 · May 25, 2021 · Updated 4 years ago
- ☆213 · Jun 16, 2018 · Updated 7 years ago
- Covering grammars for English and Russian text normalization ☆61 · Sep 15, 2019 · Updated 6 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN) ☆21 · Sep 27, 2017 · Updated 8 years ago
- Convert words to numbers ☆21 · Apr 13, 2022 · Updated 3 years ago
- Various scripts that facilitate the preparation of Automatic Speech Recognition related resources ☆17 · Apr 16, 2020 · Updated 5 years ago
- This is an extension of the Kaldi speech recognition software that allows decoding of speech with hybrid word and phoneme graphs.… ☆11 · Feb 4, 2020 · Updated 6 years ago
- BurrMill core ☆22 · Nov 2, 2021 · Updated 4 years ago
- CS224S Course Project ☆14 · Jun 9, 2014 · Updated 11 years ago
- Speech waveform synthesis filters ☆13 · Jul 21, 2017 · Updated 8 years ago
- A simple tutorial on setting up Sparrowhawk - a text-to-speech normalization engine ☆14 · Oct 16, 2017 · Updated 8 years ago
- Multilingual Grapheme to Phoneme ☆51 · Feb 23, 2016 · Updated 10 years ago
- Python tool for normalizing text and text canonicalization (DISCONTINUED) ☆41 · Sep 3, 2013 · Updated 12 years ago
- ChiNese Text Normalization (CNTN) tool for Text-to-speech system ☆37 · Apr 12, 2018 · Updated 7 years ago
- ☆26 · Apr 21, 2021 · Updated 4 years ago
- INACTIVE - http://mzl.la/ghe-archive - Tools to create ARPA models from cmu pocketsphinx dictionaries for proper g2p generation ☆21 · Mar 29, 2019 · Updated 6 years ago
- Multiobjective Optimization Training of PLDA for Speaker Verification ☆10 · Jun 14, 2018 · Updated 7 years ago
- Scripts for recreating the Replication Dataset for Fundamental Frequency Estimation. Part of the dissertation "Pitch of Voiced Speech in … ☆11 · Mar 29, 2021 · Updated 4 years ago
- ☆10 · Mar 20, 2021 · Updated 4 years ago
- Estonian text-to-speech text normalization pipeline ☆12 · Dec 17, 2025 · Updated 2 months ago
- Gentle and praatio scripts for easy forced alignment ☆18 · Oct 27, 2022 · Updated 3 years ago
- NPTEL2020: Speech2Text dataset for Indian-English Accent ☆81 · Dec 24, 2021 · Updated 4 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 · Feb 13, 2021 · Updated 5 years ago
- A tool that assists with creating labels for NNSVS training data. ☆10 · Apr 5, 2023 · Updated 2 years ago
- Perform the forced decoding with target transcription ☆11 · Sep 12, 2018 · Updated 7 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆11 · Feb 6, 2024 · Updated 2 years ago
- This is a github repository of the abandonware Sequitur G2P by Bisani & Ney ☆175 · Dec 16, 2025 · Updated 2 months ago
- ☆47 · May 22, 2017 · Updated 8 years ago
- Bayesian spEEch Recognizer ☆55 · Jan 11, 2021 · Updated 5 years ago
- Text normalization scripts from IRISA lab ☆14 · Jun 1, 2018 · Updated 7 years ago
- Humphrey, E. J. "An Exploration of Deep Learning in Music Informatics." (2015) New York University. ☆14 · Feb 23, 2016 · Updated 10 years ago
- PAVOQUE Corpus of Expressive Speech ☆12 · Aug 2, 2016 · Updated 9 years ago
- Text-to-Speech tutorial at SLTU 2016 ☆35 · May 10, 2016 · Updated 9 years ago
- Phonetically-Oriented Word Error Rate ☆36 · May 4, 2019 · Updated 6 years ago
- Unicode Standard tokenization routines and orthography profile segmentation ☆39 · Feb 20, 2025 · Updated last year
- An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences. ☆170 · Jan 7, 2026 · Updated last month
- Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text ☆247 · Oct 30, 2019 · Updated 6 years ago
- Python wrapper for kaldi's arpa2fst ☆38 · Aug 27, 2025 · Updated 6 months ago