CLD2Owners / cld2
Compact Language Detector 2
☆846 Updated 3 years ago
Alternatives and similar repositories for cld2:
Users who are interested in cld2 are comparing it to the libraries listed below.
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆160 Updated 4 years ago
- ☆807 Updated last year
- Heuristic based boilerplate removal tool ☆744 Updated 8 months ago
- Language Detection with Infinity-gram ☆231 Updated 9 years ago
- Python stemming library using snowball stemmers ☆248 Updated 3 months ago
- Snowball compiler and stemming algorithms ☆771 Updated last week
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ☆368 Updated 2 years ago
- Multilingual word vectors in 78 languages ☆1,195 Updated last year
- 🦆 Contextually-keyed word vectors ☆1,634 Updated 10 months ago
- Simple, fast unsupervised word aligner ☆743 Updated 2 years ago
- UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files ☆372 Updated 2 months ago
- Bitextor generates translation memories from multilingual websites ☆293 Updated 2 months ago
- Multilingual text (NLP) processing toolkit ☆2,319 Updated last year
- Machine-readable lists of lemma-token pairs in 23 languages. ☆335 Updated 3 years ago
- Python Implementations of Word Sense Disambiguation (WSD) Technologies. ☆747 Updated 2 years ago
- Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) ☆1,193 Updated 3 months ago
- ☆167 Updated 7 months ago
- This is a language detection library implemented in plain Java. (aliases: language identification, language guessing) ☆737 Updated 5 years ago
- Simhash and near-duplicate detection ☆413 Updated last year
- CRF++: Yet Another CRF toolkit ☆507 Updated 3 years ago
- Generating Vectors for DBpedia Entities via Word2Vec and Wikipedia Dumps. Questions? https://gitter.im/idio-opensource/Lobby ☆600 Updated 7 years ago
- Neural Adaptive Machine Translation that adapts to context and learns from corrections. ☆343 Updated 2 years ago
- Deep learning models trained to correct input errors in short, message-like text ☆1,231 Updated 5 years ago
- Open-Source Neural Machine Translation in Tensorflow ☆796 Updated 2 years ago
- This is a mirror of the script by Giuseppe Attardi, and contains history before the official repo started: https://github.com/attardi/wik… ☆258 Updated 8 years ago
- (Official repo for pypi package) Python bindings for the Hunspell spellchecker engine ☆186 Updated 3 years ago
- All languages stopwords collection ☆427 Updated last year
- Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages ☆542 Updated 3 years ago
- Python implementation of TextRank algorithm for automatic keyword extraction and summarization using Levenshtein distance as relation bet… ☆773 Updated 2 years ago
- CMU ARK Twitter Part-of-Speech Tagger ☆575 Updated last year