seomoz/simhash-cluster

A cluster implementation of simhash near-duplicate detection

☆32

Alternatives and similar repositories for simhash-cluster

Users who are interested in simhash-cluster are comparing it to the libraries listed below.

seomoz / g-crawl-py
Gevent Crawling in Python, with Utilities
☆22 · Mar 12, 2015 · Updated 11 years ago
seomoz / simhash-db-py
Python API for Various DB-Backed Simhash Clusters
☆64 · Mar 16, 2017 · Updated 9 years ago
dgleich / libbvg
A C implementation of a Boldi-Vigna graph decompressor
☆17 · Jul 5, 2016 · Updated 10 years ago
kxtells / vague-places
☆14 · Dec 24, 2016 · Updated 9 years ago
cltl / KafNafParserPy
Parser for KAF NAF files written in Python
☆16 · Jul 1, 2021 · Updated 5 years ago
flike / golog
simple log library for Golang
☆14 · Jun 25, 2015 · Updated 11 years ago
jgrahamc / wtickle
Small program to run requests against a web server and look for problems
☆11 · Jan 20, 2016 · Updated 10 years ago
errx / go-asap
ASAP smoothing
☆13 · Sep 8, 2017 · Updated 8 years ago
tomafro / rails-activerecord-columnreader
A simple column reader for ActiveRecord
☆13 · Nov 1, 2011 · Updated 14 years ago
bmuller / pymur
pymur is a Python interface to The Lemur Toolkit.
☆19 · Sep 17, 2018 · Updated 7 years ago
seiflotfy / superminhash
SuperMinHash: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation
☆25 · Jan 1, 2018 · Updated 8 years ago
kiranvodrahalli / cos521
Final project for COS 521: Using Hokusai algorithm to approximate frequency counts of hashtags in twitter data stream.
☆12 · Jan 13, 2015 · Updated 11 years ago
codahale / sskg
A Go implementation of a fast, tree-based Seekable Sequential Key Generator.
☆24 · Nov 7, 2014 · Updated 11 years ago
sasha-s / go-IBLT
implements invertible bloom filters in golang
☆16 · Feb 3, 2020 · Updated 6 years ago
seomoz / url-py
URL Transformation, Sanitization
☆104 · Jan 16, 2024 · Updated 2 years ago
leftnoteasy / pymining
python data mining platform
☆16 · Jan 17, 2013 · Updated 13 years ago
dams / graphite-riakts
drop-in replacement for graphite node using Riak TS
☆13 · Aug 6, 2016 · Updated 9 years ago
amandasaurus / pgindexrebuild
Production friendly tool to get rid of index bloat in PostgreSQL
☆12 · Feb 8, 2023 · Updated 3 years ago
thejerf / strinterp
Morally-correct string and stream interpolation for Go.
☆24 · Apr 23, 2016 · Updated 10 years ago
seomoz / qless-py
Python Bindings for qless
☆47 · Sep 23, 2019 · Updated 6 years ago
bgithub1 / cme_open_interest
ETL project to download and process both CME open interest data, COT data from the CFTC and NAV/shares-outstanding data from various ETF …
☆13 · Jul 13, 2021 · Updated 5 years ago
issuj / gofaster
Faster alternatives for some Go stdlib packages
☆14 · Nov 9, 2017 · Updated 8 years ago
KIZI / LinkedHypernymsDataset
☆14 · Aug 24, 2021 · Updated 4 years ago
armon / go-hlld
Golang client for HyperLogLog daemon (hlld)
☆21 · Jan 31, 2016 · Updated 10 years ago
c00w / gevent-dht
A dht based on gevent.
☆41 · Feb 25, 2015 · Updated 11 years ago
dreid / atomiclong
A CFFI using AtomicLong type for CPython and PyPy.
☆23 · Jun 21, 2017 · Updated 9 years ago
andrewclegg / sketchy
Simple approximate-nearest-neighbours in Python using locality sensitive hashing.
☆141 · Jun 21, 2012 · Updated 14 years ago
nvkelso / map-label-style-manual
Abbreviations, nicknames, foreign terms, translations, transliterations, diacritical marks, suggested placements, and more
☆24 · Jul 18, 2012 · Updated 14 years ago
efficient / gobin-codegen
Automatic codegen for encoding/binary marshaling
☆17 · Mar 14, 2015 · Updated 11 years ago
leemcloughlin / gofarmhash
Port of Google's Farmhash version 1.0.0 to, pure, Go
☆17 · Sep 19, 2016 · Updated 9 years ago
alsemyonov / als_typograf
Ruby client for ArtLebedevStudio.RemoteTypograf Web Service.
☆15 · Jan 10, 2016 · Updated 10 years ago
fogbeam / Neddick
Neddick: Open Source Information Discovery Platform
☆36 · Mar 15, 2023 · Updated 3 years ago
KamalaSowmya / DiscussionSummarization
Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas…
☆12 · Apr 10, 2014 · Updated 12 years ago
efficient / libcuckoo-c
High-performance Concurrent Cuckoo Hashing Library
☆47 · Jan 20, 2015 · Updated 11 years ago
machinalis / telegraphy
Telegraphy provides real time events for WSGI Python applications
☆202 · Jun 19, 2015 · Updated 11 years ago
yext / revere
“One if by land, and two if by sea”—Alerting for Graphite
☆23 · Jul 12, 2023 · Updated 3 years ago
sauerbraten / radix
An implementation of the radix tree data structure (http://en.wikipedia.org/wiki/Radix_tree).
☆21 · Feb 10, 2015 · Updated 11 years ago
sergey-melnychuk / uppercut
Small and simple actor model implementation.
☆10 · Mar 7, 2026 · Updated 4 months ago
mmower / bishop
A bayesian classifier library for Ruby
☆24 · Nov 1, 2011 · Updated 14 years ago