optimaize / language-detector
Language Detection Library for Java
☆575 Updated 2 years ago
Alternatives and similar repositories for language-detector
Users who are interested in language-detector are comparing it to the libraries listed below:
- This is a language detection library implemented in plain Java. (aliases: language identification, language guessing) ☆748 Updated 6 years ago
- A set of reusable Java components that implement functionality common to any web crawler ☆243 Updated last week
- extJWNL (Extended Java WordNet Library) is a Java API for creating, reading and updating dictionaries in WordNet format. ☆128 Updated last year
- Java implementation of the Aho-Corasick algorithm for efficient string matching ☆912 Updated 11 months ago
- Word2Vec Java Port ☆186 Updated 6 years ago
- Java API for GeoIP2 webservice client and database reader ☆806 Updated this week
- A bundle of html content extraction algorithms ☆121 Updated 9 years ago
- A Java library to detect and normalize URLs in text ☆783 Updated 2 years ago
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆161 Updated 4 years ago
- ☆184 Updated 6 years ago
- Similarity or Distance Metrics, e.g. Levenshtein, for Java ☆345 Updated 3 years ago
- Compact Language Detector 2 ☆855 Updated 3 years ago
- A language detection library for the JVM ☆36 Updated last year
- SitemapGen4j is a library to generate XML sitemaps in Java. ☆165 Updated 3 years ago
- An Elasticsearch ingest processor to do named entity extraction using Apache OpenNLP ☆270 Updated 2 years ago
- A plugin for language detection in Elasticsearch using Nakatani Shuyo's language detector ☆252 Updated 7 years ago
- Java library to convert short codes, HTML entities, and emoticons to emojis and vice versa ☆201 Updated 2 years ago
- A simple implementation of the simhash algorithm in Java. ☆155 Updated 4 years ago
- This tool extracts word vectors from Lucene index. ☆135 Updated 7 years ago
- galimatias is a URL parsing and normalization library written in Java. ☆161 Updated last year
- Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary. ☆189 Updated last year
- Java port of Arc90's Readability.js - parses HTML as input and returns clean, easy-to-read text ☆171 Updated 11 years ago
- When jsoup meets XPath. ☆468 Updated last year
- Java text categorization system ☆55 Updated 7 years ago
- The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike ☆739 Updated this week
- jMimeMagic is a Java library for determining the MIME type of files or streams. ☆207 Updated 2 years ago
- Java port of Darts (Double ARray Trie System) ☆268 Updated 6 years ago
- Pdf2Dom is a PDF parser that converts the documents to an HTML DOM representation. The obtained DOM tree may then be serialized to a HTM… ☆181 Updated 2 years ago
- Repackaging of Boilerpipe published on Maven Central Repository. ☆53 Updated last year
- A streaming JsonPath processor in Java ☆296 Updated 9 months ago