chrisjmccormick / MinHash
Example Python code for comparing documents using MinHash
☆251 · Updated 5 years ago
Alternatives and similar repositories for MinHash
Users who are interested in MinHash are comparing it to the libraries listed below:
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆283 · Updated last year
- Simhash and near-duplicate detection ☆413 · Updated last year
- A Locality Sensitive Hashing (LSH) library with an emphasis on large, highly-dimensional datasets. ☆146 · Updated 4 months ago
- LSH based high dimensional clustering for sets and points ☆78 · Updated 10 years ago
- A pure python implementation of locality sensitive hashing for text documents ☆86 · Updated 9 years ago
- Various gfx for a presentation at NYC ML meetup ☆59 · Updated 9 years ago
- Estimating how similar are two sets using MinHash (Jaccard similarity coefficient) ☆30 · Updated 11 years ago
- An efficient simhash implementation for python ☆124 · Updated 5 years ago
- Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive… ☆766 · Updated last year
- Implementation of the PageRank algorithm ☆174 · Updated 7 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆99 · Updated 9 years ago
- Open Source Implementation of Simhash in Python ☆24 · Updated 7 years ago
- FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing) ☆1,147 · Updated 7 months ago
- Text classification example in Python using Latent Semantic Analysis (LSA) ☆105 · Updated 6 years ago
- It is a forest of random projection trees ☆224 · Updated 4 years ago
- Weighted MinHash implementation on CUDA (multi-gpu). ☆117 · Updated last year
- Topic modeling with gensim and LDA ☆168 · Updated 7 years ago
- Neural Learning to Rank using Chainer ☆31 · Updated 4 years ago
- All-pair set similarity search on millions of sets in Python and on a laptop ☆592 · Updated 2 years ago
- Collaborative modeling for recommendation. Implements variational inference for a collaborative topic models. These models recommend item… ☆147 · Updated 9 years ago
- Instructions & code for the EuroPython 2014 training session "Topic Modeling for Fun and Profit" ☆110 · Updated 10 years ago
- Collection of some algorithms for entity resolution ☆28 · Updated 9 years ago
- A Python implementation of the BM25 ranking function. ☆236 · Updated 5 years ago
- A library for k-nearest neighbor search ☆383 · Updated 9 months ago
- Word Mover's Distance from Matthew J Kusner's paper "From Word Embeddings to Document Distances" ☆537 · Updated 7 months ago
- Some add-on modules to networkx library ☆78 · Updated 4 years ago
- Implementation of various topic models ☆369 · Updated 4 years ago
- Deep Learning for Natural Language Processing ☆457 · Updated 6 years ago