vilda / shash
Similarity hashing
★49 · Updated 14 years ago
Alternatives and similar repositories for shash
Users who are interested in shash are comparing it to the libraries listed below.
- SQLite extension to add the Okapi BM25 ranking algorithm ★35 · Updated 10 years ago
 - A high performance search engine ★107 · Updated 8 years ago
 - A crawler, indexer, and query interface all in Python with distributed processing via Pyro4. ★23 · Updated 13 years ago
 - Simhashing in C++ ★135 · Updated 2 years ago
 - Suite of tools for detecting changes in web pages and their rendering ★55 · Updated last year
 - Metric tree demo ★14 · Updated 11 years ago
 - Download Hacker News (HN) stories and comments using their official APIs ★87 · Updated 5 years ago
 - A simple bloom filter for SQLite using Murmur3 ★18 · Updated 14 years ago
 - Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas… ★12 · Updated 11 years ago
 - A cluster implementation of simhash near-duplicate detection ★32 · Updated 10 years ago
 - ★13 · Updated 9 years ago
 - A general purpose graph library ★11 · Updated 7 years ago
 - Python API for Various DB-Backed Simhash Clusters ★64 · Updated 8 years ago
 - A framework for building reranking models. ★28 · Updated 10 years ago
 - Wrapper to pocketsphinx phoneme labeling tools ★18 · Updated 9 years ago
 - A vector similarity database ★230 · Updated 11 years ago
 - Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strin… ★129 · Updated 11 years ago
 - Near-duplicate detection tool ★24 · Updated 8 years ago
 - A light weight, low level embedded key-value database library ★32 · Updated 12 years ago
 - A Text Comprehension Engine in Python ★15 · Updated 10 years ago
 - memcachedb ported from BerkeleyDB to LMDB originally from http://memcachedb.googlecode.com/svn/trunk ★89 · Updated 10 years ago
 - Python Userspace TCP/IP Stack (historic upload from 2005) ★55 · Updated 15 years ago
 - ★33 · Updated 5 years ago
 - A pure C implementation of the Geohash algorithm. ★108 · Updated 6 years ago
 - Tools for web page segmentation. In development ★17 · Updated 6 years ago
 - Focused Crawler for VT's CTRNet ★10 · Updated 12 years ago
 - Persistent Storage for Ice Objects ★24 · Updated last month
 - Reduced on-disk Suffix Array ★22 · Updated 12 years ago
 - Text (source code) search engine with indexer and a front end web interface to search. Uses Python 3. ★126 · Updated 2 years ago
 - Library implementing the storage and the query evaluation for a text search engine. It uses on a key value store database interface to st… ★47 · Updated 4 years ago