epeake / ModifiedKneserNey
Interpolated Kneser-Ney smoothing with an out-of-vocabulary correction and a discount estimated from training data
☆12Updated 4 years ago
Alternatives and similar repositories for ModifiedKneserNey
Users who are interested in ModifiedKneserNey compare it to the libraries listed below
- A curated list of awesome disfluency detection publications along with the released code and bibliographical information☆77Updated 4 years ago
- Gamma Agreement in Python☆44Updated last year
- Multilingual grapheme-to-phoneme conversion☆20Updated 7 years ago
- An audio and transcribed corpus of contemporary Hong Kong Cantonese☆37Updated 4 years ago
- Spoken Mandarin Chinese from Hong Kong.☆12Updated last week
- Python module for syllabifying English ARPABET transcriptions☆66Updated 6 years ago
- Automatically convert plain text into phonemes (US English pronunciation) and syllabify☆28Updated 7 years ago
- Automatic prosodic annotation tool written in Java.☆62Updated 6 years ago
- Switchboard Dialog Act Corpus with Penn Treebank links☆144Updated 4 years ago
- Text-to-Speech tutorial at SLTU 2016☆35Updated 9 years ago
- phone inventory library☆16Updated 2 years ago
- Speech2vec pre-trained word vectors☆76Updated 6 years ago
- Spoken Cantonese from Hong Kong.☆29Updated last week
- Python library for n-gram models in ARPA format☆40Updated 2 years ago
- Implementation of the Links Online Clustering algorithm: https://arxiv.org/abs/1801.10123 ☆26Updated 3 years ago
- ☆77Updated 2 years ago
- Unicode Standard tokenization routines and orthography profile segmentation☆37Updated 3 months ago
- Script to download corpora from the Linguistic Data Consortium (LDC)☆31Updated 10 months ago
- Python code for converting among IPA, ARPABET, XSAMPA, Callhome, DISC, TIMIT, plus some lexical tones.☆35Updated last year
- Deep Learning systems for training and testing disfluency detection and related tasks on speech data.☆58Updated 6 years ago
- Covering grammars for English and Russian text normalization☆61Updated 5 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆45Updated 4 years ago
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio…☆36Updated last month
- Punctuation generation for speech transcripts using lexical and prosodic features☆41Updated 6 years ago
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning☆40Updated 2 years ago
- SegEval Segmentation Evaluation Package☆56Updated last year
- ☆49Updated 3 years ago
- Pre-training Cross-modal Transformer for Audio-and-Language Representations☆39Updated 4 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks.☆75Updated last year
- Kaldi-style neural network training in PyTorch for use in place of nnet3 in Kaldi.☆26Updated 10 months ago