ekzhu / datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
☆2,635 Updated 7 months ago
Alternatives and similar repositories for datasketch:
Users who are interested in datasketch are comparing it to the libraries listed below.
- FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing) ☆1,147 Updated 7 months ago
- Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-me… ☆3,439 Updated 3 months ago
- A fast Python implementation of locality sensitive hashing. ☆661 Updated 4 years ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆285 Updated last year
- A Python Implementation of Simhash Algorithm ☆995 Updated 2 years ago
- Benchmarks of approximate nearest neighbor libraries in Python ☆5,060 Updated 3 weeks ago
- Approximate Nearest Neighbor Search for Sparse Data in Python! ☆919 Updated 4 years ago
- Python module (C extension and plain python) implementing Aho-Corasick algorithm ☆967 Updated 9 months ago
- Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive… ☆767 Updated last year
- Learning to Rank in TensorFlow ☆2,758 Updated 9 months ago
- sentence embedding by Smooth Inverse Frequency weighting scheme ☆1,086 Updated 5 years ago
- A Collection of BM25 Algorithms in Python ☆1,082 Updated 3 months ago
- A system for quickly generating training data with weak supervision ☆5,826 Updated 8 months ago
- Deep recommender models using PyTorch. ☆2,999 Updated 2 years ago
- Python library for interactive topic model visualization. Port of the R LDAvis package. ☆1,813 Updated 6 months ago
- Learning embeddings for classification, retrieval and ranking. ☆3,952 Updated 2 years ago
- A python tool for evaluating the quality of sentence embeddings. ☆2,091 Updated 9 months ago
- A Python scikit for building and analyzing recommender systems ☆6,468 Updated 7 months ago
- A library implementing different string similarity and distance measures using Python. ☆998 Updated 2 years ago
- All-pair set similarity search on millions of sets in Python and on a laptop ☆591 Updated 2 years ago
- Compute Sentence Embeddings Fast! ☆618 Updated last year
- A Python implementation of LightFM, a hybrid recommendation algorithm. ☆4,819 Updated 5 months ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,161 Updated 6 months ago
- 🦆 Contextually-keyed word vectors ☆1,633 Updated 10 months ago
- Header-only C++/python library for fast approximate nearest neighbors ☆4,490 Updated 5 months ago
- Simhash and near-duplicate detection ☆413 Updated last year
- General purpose unsupervised sentence representations ☆1,198 Updated 2 years ago
- ☆3,158 Updated 3 years ago
- Navigating Spreading-out Graph For Approximate Nearest Neighbor Search ☆653 Updated last year
- Anserini is a Lucene toolkit for reproducible information retrieval research ☆1,041 Updated this week