jmhodges / minhash
An implementation of the MinHash algorithm in Ruby using MurmurHash
☆25 Updated 16 years ago
Alternatives and similar repositories for minhash
Users who are interested in minhash are comparing it to the libraries listed below
- trying shingling / resemblance / simhash / sketching to do some data deduping ☆97 Updated 10 years ago
- A high performance distributed graph database. ☆131 Updated 6 years ago
- C++ utility library ☆24 Updated 12 years ago
- An almost deterministic top k elements counter Redis module ☆35 Updated 6 years ago
- An open source information retrieval system written in C++11 and Python. Aspires to be an alternative to Nutch / Lucene. It uses MongoDB … ☆87 Updated 2 years ago
- Bayesian classifier on top of Redis ☆62 Updated 13 years ago
- Pretty fast parser for probabilistic context free grammars ☆88 Updated 12 years ago
- A very memory-efficient trie (radix tree) implementation ☆47 Updated 13 years ago
- syslog module for nginx ☆18 Updated 15 years ago
- A minimalist realtime full-text search index ☆153 Updated 13 years ago
- Redis bulk-loader for Apache Pig ☆40 Updated 13 years ago
- ☆43 Updated 12 years ago
- TweeQL is a Query Language for Tweets: SELECT brand(text) AS brand, sentiment(text) AS sentiment FROM twitter_sample; ☆193 Updated 11 years ago
- Realtime Analytics ☆69 Updated 12 years ago
- A JRuby DSL for Cascading ☆41 Updated 10 years ago
- A repository of non-native, useful redis commands, scripted in lua. ☆61 Updated 14 years ago
- ☆44 Updated 3 years ago
- Jeremy's Machine Learning Library ☆52 Updated 9 years ago
- hamming distance fn for postgresql ☆22 Updated 14 years ago
- A distributed bloom filter implementation based on redis ☆40 Updated 7 years ago
- Fast IO buffering ☆60 Updated 13 years ago
- Tiny data structures that pack a punch! ☆101 Updated 13 years ago
- Hybrid Relational-Database/NOSQL-Datastore ☆183 Updated 13 years ago
- Ruby bindings to Neo4j Spatial ☆26 Updated 14 years ago
- Ruby client library for controlling Google Refine ☆44 Updated 7 years ago
- Mneme is an HTTP web-service for recording and identifying previously seen records - aka, duplicate detection. ☆108 Updated 12 years ago
- My original graph database DSL machine ☆176 Updated 4 years ago
- News Aggregator that classifies and clusterifies news from different sources ☆46 Updated 14 years ago
- Round robin database pattern via Redis sorted sets ☆79 Updated 15 years ago
- An implementation of the HyperLogLog algorithm backed by Redis ☆170 Updated 10 years ago