ekzhu / datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
☆2,868 Updated 2 weeks ago
Alternatives and similar repositories for datasketch
Users who are interested in datasketch are comparing it to the libraries listed below.
- Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-me… ☆3,567 Updated 3 weeks ago
- FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing) ☆1,156 Updated last year
- Benchmarks of approximate nearest neighbor libraries in Python ☆5,587 Updated 7 months ago
- A fast Python implementation of locality sensitive hashing. ☆674 Updated 5 years ago
- Python module (C extension and plain python) implementing Aho-Corasick algorithm ☆1,079 Updated last month
- Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive… ☆770 Updated 2 years ago
- Header-only C++/python library for fast approximate nearest neighbors ☆5,082 Updated 4 months ago
- A Python Implementation of Simhash Algorithm ☆1,033 Updated 3 years ago
- Approximate Nearest Neighbor Search for Sparse Data in Python! ☆919 Updated 5 years ago
- Example Python code for comparing documents using MinHash ☆251 Updated 6 years ago
- Learning embeddings for classification, retrieval and ranking. ☆3,957 Updated 3 years ago
- Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk ☆14,142 Updated 3 months ago
- A python tool for evaluating the quality of sentence embeddings. ☆2,106 Updated last year
- Python library for interactive topic model visualization. Port of the R LDAvis package. ☆1,846 Updated 2 months ago
- A high performance implementation of HDBSCAN clustering. ☆3,058 Updated last week
- A fast, efficient universal vector embedding utility package. ☆1,651 Updated 2 years ago
- A Python nearest neighbor descent for approximate nearest neighbors ☆958 Updated last month
- Static memory-efficient Trie-like structures for Python based on marisa-trie C++ library. ☆1,123 Updated last month
- Python Keyphrase Extraction module ☆1,586 Updated 2 years ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆292 Updated 2 years ago
- A library implementing different string similarity and distance measures using Python. ☆1,021 Updated 3 years ago
- A Collection of BM25 Algorithms in Python ☆1,307 Updated last year
- 🦆 Contextually-keyed word vectors ☆1,671 Updated 9 months ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,208 Updated last week
- Toy Python implementation of http://www-nlp.stanford.edu/projects/glove/ ☆1,258 Updated 3 years ago
- ☆3,170 Updated 4 years ago
- Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Pyth… ☆568 Updated 6 years ago
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,433 Updated 6 months ago
- ☆1,257 Updated last year
- Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data ☆1,348 Updated 2 weeks ago