saffsd / langid.js
An off-the-shelf client-side language identification module for JavaScript.
☆14 Updated 10 years ago
Related projects:
- Thot toolkit for statistical machine translation ☆50 Updated last year
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆65 Updated last week
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆180 Updated 3 years ago
- Machine translation for the real world ☆23 Updated 4 years ago
- An English to Hindi Dictionary ☆25 Updated 3 years ago
- Fast Word Segmentation with Triangular Matrix ☆77 Updated 2 years ago
- A multilingual lexical and semantic resource that links words of natural languages to abstract semantic concepts. Also called U++ Common … ☆27 Updated 3 years ago
- Distributed infrastructure for Machine Translation web services (using Moses, Python, JSON-RPC/web interface) ☆33 Updated 2 years ago
- NLTK Contrib ☆166 Updated 6 months ago
- Simple crawler for some Uyghur websites such as uy.ts.cn, bbs.bagdax.cn, www.bagdax.cn (using Python and Scrapy) ☆12 Updated 3 years ago
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆21 Updated 7 years ago
- Fast approximate strings search & spelling correction ☆57 Updated 2 years ago
- Transliteration data and models ☆53 Updated 7 years ago
- Sentence aligner ☆106 Updated 3 years ago
- A tool for text normalisation via character-level machine translation ☆13 Updated 4 years ago
- Fast Word Clustering Software ☆74 Updated last month
- ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with… ☆68 Updated 3 weeks ago
- Wiktionary parser tool for many language editions. ☆53 Updated 2 years ago
- Inforex is a web system for text corpora construction. ☆11 Updated 8 months ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆11 Updated last year
- Master repo for the UniMorph project, includes the UniMorph schema and annotated data files ☆27 Updated 4 years ago
- A neural network that jointly part-of-speech tags and lemmatizes sentences, boosting accuracy for morphologically-rich languages (Czech, … ☆34 Updated 5 years ago
- Python Finite-State Toolkit ☆39 Updated last month
- ☆12 Updated 8 years ago
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format ☆22 Updated 6 years ago
- A Corpus Data Retrieval Index using Lucene for Look-Ups ☆16 Updated last week
- OpenNeuroSpell contains parts of NeuroSpell (http://neurospell.com/en.php) released as open-source. More code will be published as soon a… ☆20 Updated 2 years ago
- Make n-grams for the Uyghur language ☆14 Updated 3 years ago
- German Morphological Analyzer ☆45 Updated 2 years ago
- Language data store and linguistic query API ☆35 Updated 3 weeks ago