A Locality Sensitive Hashing (LSH) library with an emphasis on large, high-dimensional datasets.
☆149 · Sep 4, 2024 · Updated last year
Alternatives and similar repositories for SparseLSH
Users who are interested in SparseLSH are comparing it to the libraries listed below.
- A fast Python implementation of locality sensitive hashing. ☆674 · Apr 30, 2020 · Updated 5 years ago
- Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive… ☆771 · Feb 23, 2023 · Updated 3 years ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆293 · Jun 11, 2023 · Updated 2 years ago
- Weighted MinHash implementation on CUDA (multi-gpu). ☆121 · Nov 29, 2023 · Updated 2 years ago
- spaCy-to-naf converter ☆21 · Jun 10, 2025 · Updated 8 months ago
- A Text Comprehension Engine in Python ☆15 · Aug 23, 2015 · Updated 10 years ago
- A pure python implementation of locality sensitive hashing for text documents ☆87 · Oct 24, 2015 · Updated 10 years ago
- A ROS1/ROS2 compatible, RDFlib-backed knowledge base for robotic applications. Mostly KB-API conformant. ☆16 · Sep 12, 2025 · Updated 5 months ago
- Tweets annotated with coarse-grained sense labels (supersenses) ☆13 · Jun 13, 2014 · Updated 11 years ago
- Approximate Nearest Neighbor Search for Sparse Data in Python! ☆920 · Oct 2, 2020 · Updated 5 years ago
- Content-based Recommendation Generator ☆13 · Jan 21, 2015 · Updated 11 years ago
- LSH based high dimensional clustering for sets and points ☆80 · Nov 15, 2014 · Updated 11 years ago
- Python Approximate Nearest Neighbor Search in very high dimensional spaces with optimised indexing. ☆216 · Oct 7, 2021 · Updated 4 years ago
- python3 package supporting efficient storage and querying of sets of sets using the trie data structure. Supports finding all the superse… ☆23 · Sep 15, 2023 · Updated 2 years ago
- Semanticizest: dump parser and client ☆20 · May 11, 2016 · Updated 9 years ago
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆35 · Sep 30, 2016 · Updated 9 years ago
- collection of modules to build distributed and reliable concurrent systems in Python. ☆206 · Sep 14, 2013 · Updated 12 years ago
- An attempt at creating a gold standard dataset for backtesting yesterday & today's content-extractors ☆35 · Mar 19, 2015 · Updated 10 years ago
- A method to mine beyond-pairwise relationships using Min-Hashing for large-scale pattern discovery ☆28 · Oct 10, 2021 · Updated 4 years ago
- A simple multicohort LTV calculator for subscriptions ☆11 · Mar 7, 2023 · Updated 2 years ago
- Natural language hashing library. ☆10 · Nov 24, 2014 · Updated 11 years ago
- A semantic web crawler ☆20 · Sep 20, 2010 · Updated 15 years ago
- large-memory key-value pair store for Python ☆50 · May 26, 2013 · Updated 12 years ago
- A C++ implementation of the Two-Pass Pairing Heap data structure. ☆11 · Oct 9, 2016 · Updated 9 years ago
- This project deals with hierarchical classification of web pages based on dmoz dataset. ☆14 · Apr 10, 2014 · Updated 11 years ago
- A module to serialize python objects to json api compatible messages and also deserialize json api messages back to python objects. ☆10 · Feb 9, 2026 · Updated 2 weeks ago
- Implicit relation extractor using a natural language model. ☆24 · May 25, 2018 · Updated 7 years ago
- A Cython interface to FLANN ☆24 · Nov 25, 2020 · Updated 5 years ago
- Simple approximate-nearest-neighbours in Python using locality sensitive hashing. ☆141 · Jun 21, 2012 · Updated 13 years ago
- different types of tutorials, such as machine learning, image processing, etc. ☆102 · Apr 3, 2016 · Updated 9 years ago
- implementations of a counting bloom, a timing bloom and a scaling timing bloom... made for working with streams! ☆42 · Feb 1, 2017 · Updated 9 years ago
- A queue-controlled browser automation tool for improving web crawl quality ☆64 · Aug 13, 2025 · Updated 6 months ago
- A C++ toolbox of locality-sensitive hashing (LSH), provides several popular LSH algorithms, also supports python and matlab. ☆293 · Jun 29, 2017 · Updated 8 years ago
- Variational Information Maximization for Feature Selection ☆11 · Aug 24, 2016 · Updated 9 years ago
- Failover AWS Spot Instances ☆11 · Dec 8, 2017 · Updated 8 years ago
- My fork of zerofrog's fast SIFT C++ reimplementation of Bill Lowe's original smash-hit image-analysis algorithm. ☆21 · Sep 19, 2012 · Updated 13 years ago
- ☆10 · Dec 3, 2020 · Updated 5 years ago
- A platform for collecting, analyzing, and visualizing social media data. ☆13 · Dec 27, 2020 · Updated 5 years ago
- A system for disambiguating toponyms (placenames) given textual context and creating visualizations of the locations referenced in a give… ☆19 · Jul 24, 2013 · Updated 12 years ago