viswavi / languageid
Identifying the language of input text using character-level n-grams, with support for 45 languages
☆11 · Updated last year
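The description above names character-level n-grams as the technique. A minimal sketch of that general idea follows; it is not taken from the languageid repository. It assumes trigram count profiles compared by cosine similarity, and the function names (`char_ngrams`, `train`, `identify`) and the two-language toy training data are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of a lowercased, space-padded text."""
    padded = " " + text.lower() + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(samples):
    """Build one n-gram profile per language from {language: [texts]}."""
    profiles = {}
    for lang, texts in samples.items():
        profile = Counter()
        for text in texts:
            profile.update(char_ngrams(text))
        profiles[lang] = profile
    return profiles


def identify(text, profiles):
    """Return the language whose profile is most similar to the text."""
    grams = char_ngrams(text)
    return max(profiles, key=lambda lang: cosine(grams, profiles[lang]))


# Toy training data (an assumption for this sketch; a real system would
# train on far larger corpora and many more languages).
TOY_SAMPLES = {
    "en": [
        "the quick brown fox jumps over the lazy dog",
        "this is a simple sentence in english",
    ],
    "es": [
        "el rápido zorro marrón salta sobre el perro perezoso",
        "esta es una oración sencilla en español",
    ],
}

profiles = train(TOY_SAMPLES)
print(identify("the dog is in the house", profiles))
print(identify("el perro es marrón", profiles))
```

With so little training text the profiles are sparse, so this sketch is only reliable on inputs that share vocabulary with the samples. Other scoring schemes, such as rank-order distance or smoothed n-gram probabilities, would slot into `identify` in place of cosine similarity.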
Related projects
Alternatives and complementary repositories for languageid
- Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2 ☆112 · Updated 5 years ago
- TensorFlow with KenLM integrated for beam search scoring ☆34 · Updated 7 years ago
- An LSTM RNN for restoring missing punctuation in unsegmented text. ☆79 · Updated 8 years ago
- Tensor2tensor experiment with SpecAugment ☆47 · Updated 5 years ago
- A simple implementation of the paper https://arxiv.org/pdf/1910.00716v1.pdf ☆31 · Updated 2 years ago
- Auto Segmentation Criterion (ASG) implemented in PyTorch ☆50 · Updated 3 years ago
- Speech2vec pre-trained word vectors ☆77 · Updated 6 years ago
- ☆45 · Updated 5 years ago
- An implementation of Tacotron and Tacotron2 ☆81 · Updated 3 years ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ☆31 · Updated 3 years ago
- Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain ☆60 · Updated 5 years ago
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio… ☆35 · Updated 3 years ago
- RNNs for Text Normalization ☆38 · Updated 6 years ago
- The repository for the Speech Recognition Israel meetup group, used for collecting and sharing material. ☆13 · Updated 4 years ago
- Jupyter Notebooks for creating Speech datasets ☆46 · Updated 5 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to… ☆42 · Updated 3 years ago
- Tool for creation, manipulation and maintenance of voice corpora ☆81 · Updated 6 months ago
- Python API for reading and querying ARPA formatted language models. ☆33 · Updated 10 years ago
- Fast parallel CTC. ☆31 · Updated 6 years ago
- PyTorch CTC Decoder bindings ☆14 · Updated 7 years ago
- PyTorch bindings for Warp-CTC ☆42 · Updated 4 years ago
- LSTM Language Model with Subword Units Input Representations ☆43 · Updated 3 years ago
- ☆74 · Updated 3 years ago
- PyTorch CTC Decoder bindings ☆42 · Updated 6 years ago
- A Neural Machine Translation toolkit for research purposes ☆82 · Updated 2 weeks ago
- ☆48 · Updated 2 years ago
- MaSS - Multilingual corpus of Sentence-aligned Spoken utterances ☆48 · Updated last month
- An extremely simple Python wrapper for the SRI Language Modeling toolkit ☆70 · Updated 10 years ago
- All you need to get started for the Zero Speech Challenge 2017 ☆46 · Updated 5 years ago