jdf / cue.language
A small Java library for simple text analysis - counting strings, identifying languages, and removing stop words.
☆156 · Updated 5 years ago
Alternatives and similar repositories for cue.language:
Users who are interested in cue.language are comparing it to the libraries listed below.
- A small Java library for simple text analysis - counting strings, identifying languages, and removing stop words. ☆59 · Updated 7 years ago
- SIREn - Semi-Structured Information Retrieval Engine ☆107 · Updated 3 years ago
- [not maintained] Custom Twitter Search via ElasticSearch&Wicket ☆61 · Updated 4 years ago
- Common Crawl support library to access 2008-2012 crawl archives (ARC files) ☆493 · Updated 7 years ago
- Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps. ☆158 · Updated 2 years ago
- A fast and easy to use decision tree learner in java ☆232 · Updated 2 years ago
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop. ☆281 · Updated 6 years ago
- The Sweble Wikitext Components module provides a parser for MediaWiki's wikitext and an engine trying to emulate the behavior of a MediaW… ☆71 · Updated 9 months ago
- A port of the arclabs 'readability' package to Java ☆72 · Updated 12 years ago
- SQLite JDBC Driver ☆159 · Updated 15 years ago
- GWT implementation of the standard node.js library ☆87 · Updated 13 years ago
- A Lazy Data Flow Framework (no longer active - see Apache TinkerPop) ☆277 · Updated 3 years ago
- ElasticSearch OSEM ☆22 · Updated last year
- Java implementation of a probabilistic set data structure ☆143 · Updated 7 years ago
- SimpleJPA - Java Persistence API (JPA) implementation for Amazon SimpleDB ☆52 · Updated 4 years ago
- Natural language date parsing in Java, ported directly from Ruby's chronic ☆126 · Updated 2 years ago
- Leaner version of jpropel, containing only LINQ, reified collections and utilities for arrays/strings/numerics/files/xml etc. ☆125 · Updated 11 years ago
- Various utilities regarding Levenshtein transducers. (Java) ☆57 · Updated 3 years ago
- Slushpile for handy fragments of script that don't fit anywhere else ☆119 · Updated 11 years ago
- Graffiti is a micro web framework for Groovy inspired by Sinatra ☆98 · Updated 10 years ago
- A Java library for authenticating HTTP Requests using OAuth ☆217 · Updated last year
- Find the Git commits you're looking for ☆118 · Updated last year
- Java framework for Google App Engine ☆80 · Updated 4 years ago
- Bulk loading for elastic search ☆185 · Updated last year
- Sitebricks: A fast platform for web development. ☆248 · Updated 2 years ago
- WARC (Web Archive) Input and Output Formats for Hadoop ☆35 · Updated 10 years ago
- Yoga is RESTful but flexible. ☆157 · Updated last year