ekzhu / datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
☆2,709 Updated last year
Alternatives and similar repositories for datasketch
Users who are interested in datasketch are comparing it to the libraries listed below.
- Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-me… ☆3,491 Updated 8 months ago
- FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing) ☆1,151 Updated last year
- Benchmarks of approximate nearest neighbor libraries in Python ☆5,288 Updated last month
- A fast Python implementation of locality sensitive hashing. ☆664 Updated 5 years ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆287 Updated last year
- Learning embeddings for classification, retrieval and ranking. ☆3,953 Updated 2 years ago
- A system for quickly generating training data with weak supervision ☆5,861 Updated last year
- A fast, efficient universal vector embedding utility package. ☆1,645 Updated last year
- Header-only C++/python library for fast approximate nearest neighbors ☆4,719 Updated last month
- Generate embeddings from large-scale graph-structured data. ☆3,413 Updated last year
- Python module (C extension and plain python) implementing Aho-Corasick algorithm ☆1,001 Updated last year
- Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet f… ☆1,839 Updated last year
- NLP made easy ☆2,559 Updated last year
- GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network. ☆1,265 Updated 5 years ago
- Learning to Rank in TensorFlow ☆2,776 Updated last year
- A python tool for evaluating the quality of sentence embeddings. ☆2,108 Updated last year
- Python framework for fast (approximated) nearest neighbour search in large, high-dimensional data sets using different locality-sensitive… ☆767 Updated 2 years ago
- Approximate Nearest Neighbor Search for Sparse Data in Python! ☆919 Updated 4 years ago
- A Python Implementation of Simhash Algorithm ☆1,013 Updated 3 years ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,907 Updated 2 years ago
- Deep recommender models using PyTorch. ☆3,020 Updated 2 years ago
- A library for k-nearest neighbor search ☆386 Updated last year
- All-pair set similarity search on millions of sets in Python and on a laptop ☆596 Updated 2 years ago
- A library for Multilingual Unsupervised or Supervised word Embeddings ☆3,218 Updated 2 years ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,176 Updated 10 months ago
- Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages ☆7,475 Updated this week
- [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821 ☆3,560 Updated 7 months ago
- InferSent sentence embeddings ☆2,284 Updated 3 years ago
- Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data ☆1,307 Updated this week
- Static memory-efficient Trie-like structures for Python based on marisa-trie C++ library. ☆1,081 Updated 2 weeks ago