shuyo / language-detection
This is a language detection library implemented in plain Java. (aliases: language identification, language guessing)
☆733 Updated 5 years ago
Related projects
Alternatives and complementary repositories for language-detection
- Language Detection Library for Java ☆569 Updated 2 years ago
- Compact Language Detector 2 ☆844 Updated 3 years ago
- Apache OpenNLP ☆1,447 Updated this week
- MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, informat… ☆989 Updated 8 months ago
- Language Detection with Infinity-gram ☆231 Updated 9 years ago
- Java interface for fastText ☆229 Updated last year
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆160 Updated 4 years ago
- ☆797 Updated last year
- Heuristic based boilerplate removal tool ☆729 Updated 6 months ago
- Word2Vec Java Port ☆186 Updated 6 years ago
- DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text. ☆756 Updated 6 years ago
- TextTeaser is an automatic summarization algorithm. ☆1,970 Updated 6 years ago
- The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike ☆710 Updated this week
- A large-scale statistical machine translation system written in Java. ☆208 Updated 2 years ago
- Work in progress transmit from Google Code ☆1,109 Updated 6 years ago
- Simhash and near-duplicate detection ☆410 Updated last year
- extJWNL (Extended Java WordNet Library) is a Java API for creating, reading and updating dictionaries in WordNet format. ☆126 Updated 8 months ago
- ☆184 Updated 5 years ago
- Stand-alone language identification system ☆2,324 Updated 4 years ago
- Generating Vectors for DBpedia Entities via Word2Vec and Wikipedia Dumps. Questions? https://gitter.im/idio-opensource/Lobby ☆601 Updated 6 years ago
- Official version of TextTeaser. ☆621 Updated 6 years ago
- Java implementation of the Aho-Corasick algorithm for efficient string matching ☆898 Updated 6 months ago
- Just the facts -- web page content extraction ☆1,254 Updated 4 months ago
- A fast and accurate POS and morphological tagging toolkit (EACL 2014) ☆140 Updated 4 years ago
- Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages ☆539 Updated 3 years ago
- Port of Google's language-detection library to Python. ☆1,729 Updated 9 months ago
- CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, rel… ☆473 Updated last year
- Approximate nearest neighbors in Java ☆138 Updated 4 years ago
- Quality information extraction at web scale. ☆327 Updated 7 years ago
- Apache Joshua ☆104 Updated 4 years ago