gurgeous / simhilarity
Measure text similarity using weighted ngrams.
☆18 · Updated 11 years ago
Alternatives and similar repositories for simhilarity
Users that are interested in simhilarity are comparing it to the libraries listed below
- Ruby implementation of the PageRank and TextRank algorithms. ☆75 · Updated 6 months ago
- A ruby/c extension to Christian Borgelt's apriori item-set implementation ☆55 · Updated 15 years ago
- Pragmatic Segmenter is a rule-based sentence boundary detection gem that works out-of-the-box across many languages. ☆577 · Updated last year
- Launch AWS Elastic MapReduce jobs that process Common Crawl data. ☆49 · Updated 8 years ago
- A scalable and shareable repository of text annotation ☆33 · Updated last week
- Compare image similarity with a dhash ☆91 · Updated 2 years ago
- A document vector search with flexible matrix transforms. Currently supports Latent semantic analysis and Term frequency - inverse docume… ☆149 · Updated 5 years ago
- Fuzzy document finding in Ruby ☆23 · Updated last week
- Wikidata and Wikipedia API client. ☆35 · Updated 2 years ago
- Wikipedia information extraction library ☆176 · Updated last year
- Simple Ruby client for Wikidata ☆35 · Updated last year
- Ruby Binding for Stanford Pos-Tagger and Name Entity Recognizer ☆92 · Updated 11 years ago
- Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy ☆112 · Updated 4 years ago
- Expose libstemmer_c to Ruby ☆250 · Updated 3 years ago
- A pure Ruby implementation of the Aho-Corasick string matching algorithm ☆34 · Updated 9 years ago
- Ruby port of UEALite Stemmer - a conservative stemmer for search and indexing ☆54 · Updated last month
- Simple Naive Bayes classifier ☆49 · Updated 13 years ago
- Ruby wrapper for correcting spelling and grammar mistakes based on the context of complete sentences. ☆477 · Updated 6 years ago
- A Ruby interface to the WordNet® Lexical Database. ☆139 · Updated 2 years ago
- A JRuby command line application and library for Apache Tika to extract text and metadata from files of various formats. ☆54 · Updated 6 months ago
- A Ruby wrapper for Latent Dirichlet Allocation (LDA). ☆134 · Updated 5 years ago
- annoy-rb provides Ruby bindings for the Annoy (Approximate Nearest Neighbors Oh Yeah). ☆36 · Updated this week
- Implementation of the Rapid Automatic Keyword Extraction algorithm in Ruby, a multi-word keywords extraction. ☆37 · Updated 11 years ago
- Wicked fast Conditional Random Fields for Ruby ☆37 · Updated 2 years ago
- A pure Ruby interface to the WordNet database ☆91 · Updated 6 years ago
- Polipus: distributed and scalable web-crawler framework ☆92 · Updated 10 years ago
- Machine learning and data mining algorithms for JRuby ☆92 · Updated 8 years ago
- A library for generating fake data such as names, addresses and much more. ☆12 · Updated 6 years ago
- Web crawler with very flexible crawling options. Can either use standalone or can be used with resque to perform clustered crawls. ☆225 · Updated 2 years ago
- Named entity recognition with Stanford NER and Ruby ☆20 · Updated 2 years ago