ekzhu / datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
☆2,730 · Updated last year
Alternatives and similar repositories for datasketch
Users who are interested in datasketch are comparing it to the libraries listed below.
- A fast Python implementation of locality sensitive hashing. ☆664 · Updated 5 years ago
- FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing) ☆1,152 · Updated last year
- Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-me… ☆3,511 · Updated 9 months ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆289 · Updated 2 years ago
- A Python Implementation of Simhash Algorithm ☆1,019 · Updated 3 years ago
- Benchmarks of approximate nearest neighbor libraries in Python ☆5,332 · Updated 3 weeks ago
- Python module (C extension and plain python) implementing Aho-Corasick algorithm ☆1,007 · Updated 2 weeks ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,183 · Updated 3 weeks ago
- Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive… ☆767 · Updated 2 years ago
- A system for quickly generating training data with weak supervision ☆5,877 · Updated last year
- All-pair set similarity search on millions of sets in Python and on a laptop ☆597 · Updated 2 years ago
- Approximate Nearest Neighbor Search for Sparse Data in Python! ☆919 · Updated 4 years ago
- Learning embeddings for classification, retrieval and ranking. ☆3,954 · Updated 2 years ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆1,273 · Updated 3 years ago
- Topic Modelling for Humans ☆16,079 · Updated 3 weeks ago
- Heuristic based boilerplate removal tool ☆783 · Updated 4 months ago
- ☆3,168 · Updated 3 years ago
- A library implementing different string similarity and distance measures using Python. ☆1,014 · Updated 2 years ago
- Header-only C++/python library for fast approximate nearest neighbors ☆4,773 · Updated last week
- Commented (but unaltered) version of original word2vec C implementation. ☆800 · Updated 4 years ago
- Fast Python Collaborative Filtering for Implicit Feedback Datasets ☆3,681 · Updated 11 months ago
- 🪼 a python library for doing approximate and phonetic matching of strings. ☆2,144 · Updated 2 weeks ago
- Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data ☆1,310 · Updated last week
- Python package for performing Entity and Text Matching using Deep Learning. ☆594 · Updated last year
- NLP, before and after spaCy ☆2,227 · Updated last year
- A library for k-nearest neighbor search ☆385 · Updated last year
- Distributed Asynchronous Hyperparameter Optimization in Python ☆7,440 · Updated last month
- Navigating Spreading-out Graph For Approximate Nearest Neighbor Search ☆680 · Updated last year
- Example Python code for comparing documents using MinHash ☆251 · Updated 6 years ago
- Simhash and near-duplicate detection ☆416 · Updated 2 years ago