google / transliteration
Transliteration data and models
☆56 Updated 9 years ago
Alternatives and similar repositories for transliteration
Users who are interested in transliteration compare it to the libraries listed below.
- Thot toolkit for statistical machine translation ☆53 Updated 3 years ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆200 Updated 5 years ago
- Text and Punctuation correction with Deep Learning ☆128 Updated 5 years ago
- Crawler for linguistic corpora ☆213 Updated 5 months ago
- Automatic transliteration with LSTM ☆92 Updated 7 years ago
- Transliteration module for Indian Languages ☆79 Updated 3 months ago
- A code for transliterating (romanizing) Arabic text using the American Library Association - Library of Congress (ALA-LC) standard ☆49 Updated 3 years ago
- Distributed infrastructure for Machine Translation web services (using Moses, Python, JSON-RPC/web interface) ☆34 Updated 4 years ago
- Fast supervised sentence boundary detection using the averaged perceptron ☆91 Updated 7 years ago
- Deep-learning based sentence auto-segmentation from unstructured text w/o punctuation ☆36 Updated 8 years ago
- SymSpellCompound: compound aware automatic spelling correction ☆65 Updated 7 years ago
- NLTK Contrib ☆169 Updated last year
- General-Purpose Neural Networks for Sentence Boundary Detection ☆73 Updated 2 years ago
- A fast and accurate POS and morphological tagging toolkit (EACL 2014) ☆149 Updated 5 years ago
- Fast approximate strings search & spelling correction ☆60 Updated 4 years ago
- ☆26 Updated 3 years ago
- Corpus preprocessing ☆99 Updated last year
- Microsoft Speech Language Translation (MSLT) Corpus ☆19 Updated 8 years ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg… ☆129 Updated this week
- Indian Language Tagger and Chunker (Hindi, Telugu, Tamil, Marathi, Punjabi, Kannada, Malayalam, Urdu, Bengali) ☆42 Updated 3 years ago
- An LSTM RNN for restoring missing punctuation in unsegmented text. ☆78 Updated 9 years ago
- Resources to go with the Indic NLP Library ☆77 Updated 3 years ago
- A fast, simple, multilingual tokenizer ☆29 Updated 8 years ago
- Comparable documents miner: Arabic-English morphological analysis, text processing, n-gram features extraction, POS tagging, dictionary t… ☆35 Updated 8 years ago
- German Morphological Analyzer ☆51 Updated 4 years ago
- Cython wrapper on Hunspell Dictionary ☆66 Updated last year
- Wiktionary parser tool for many language editions. ☆54 Updated 3 years ago
- MIT Language Modeling Toolkit ☆118 Updated 6 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆70 Updated last month
- A PyTorch implementation of auto-punctuation learned character by character ☆141 Updated 5 years ago