ekzhu / datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
☆2,561 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for datasketch
- Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-me… ☆3,410 · Updated last month
- A fast Python implementation of locality sensitive hashing. ☆661 · Updated 4 years ago
- FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing) ☆1,137 · Updated 5 months ago
- Benchmarks of approximate nearest neighbor libraries in Python ☆4,956 · Updated last week
- Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive… ☆766 · Updated last year
- A Python Implementation of Simhash Algorithm ☆980 · Updated 2 years ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆281 · Updated last year
- Header-only C++/python library for fast approximate nearest neighbors ☆4,361 · Updated 2 months ago
- Learning embeddings for classification, retrieval and ranking. ☆3,944 · Updated last year
- All-pair set similarity search on millions of sets in Python and on a laptop ☆590 · Updated 2 years ago
- Python module (C extension and plain python) implementing Aho-Corasick algorithm ☆949 · Updated 7 months ago
- A fast, efficient universal vector embedding utility package. ☆1,627 · Updated last year
- ☆3,148 · Updated 2 years ago
- Fast Python Collaborative Filtering for Implicit Feedback Datasets ☆3,555 · Updated 3 months ago
- Approximate Nearest Neighbor Search for Sparse Data in Python! ☆916 · Updated 4 years ago
- Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data ☆1,256 · Updated this week
- Python library for interactive topic model visualization. Port of the R LDAvis package. ☆1,805 · Updated 4 months ago
- Simhash and near-duplicate detection ☆409 · Updated last year
- Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk ☆13,237 · Updated 3 months ago
- Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Pyth… ☆563 · Updated 5 years ago
- A Python nearest neighbor descent for approximate nearest neighbors ☆888 · Updated 4 months ago
- A system for quickly generating training data with weak supervision ☆5,807 · Updated 6 months ago
- DeepWalk - Deep Learning for Graphs ☆2,678 · Updated last year
- Facilitating the design, comparison and sharing of deep text matching models. ☆3,840 · Updated 3 months ago
- Learning to Rank in TensorFlow ☆2,742 · Updated 7 months ago
- Example Python code for comparing documents using MinHash ☆250 · Updated 5 years ago
- Topic modeling with latent Dirichlet allocation using Gibbs sampling ☆1,236 · Updated 3 months ago
- NLP, before and after spaCy ☆2,215 · Updated last year
- Generate embeddings from large-scale graph-structured data. ☆3,381 · Updated 8 months ago
- Deep recommender models using PyTorch. ☆2,989 · Updated last year