All-pair set similarity search on millions of sets in Python and on a laptop
☆603Oct 11, 2022Updated 3 years ago
Alternatives and similar repositories for SetSimilaritySearch
Users that are interested in SetSimilaritySearch are comparing it to the libraries listed below.
- Efficient set similarity search algorithms implemented in Go☆35Aug 27, 2022Updated 3 years ago
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method☆15Dec 24, 2023Updated 2 years ago
- MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW☆2,928Updated this week
- Pampy: The Pattern Matching for Python you always dreamed of.☆3,527Jan 16, 2025Updated last year
- A natural language modeling framework based on PyTorch☆6,295Oct 17, 2022Updated 3 years ago
- 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.☆3,533Apr 18, 2025Updated last year
- Fast word vectors with little memory usage in Python☆416Jun 26, 2021Updated 4 years ago
- Efficient Counter that uses a limited (bounded) amount of memory regardless of data size.☆932Nov 20, 2022Updated 3 years ago
- DartMinHash: Fast Sketching for Weighted Sets☆12Dec 8, 2025Updated 6 months ago
- Feature engineering and machine learning: together at last!☆26Jan 1, 2021Updated 5 years ago
- Small Image Library for Python 3☆415Dec 8, 2022Updated 3 years ago
- A very simple framework for state-of-the-art Natural Language Processing (NLP)☆14,378Oct 27, 2025Updated 7 months ago
- Funky takes shell functions to the next level by making them easier to define, more flexible, and more interactive.☆669Jul 15, 2025Updated 10 months ago
- Learning embeddings for classification, retrieval and ranking.☆3,953Dec 4, 2022Updated 3 years ago
- Quantized word vectors that take 8x-16x less space than regular word vectors☆753Mar 31, 2020Updated 6 years ago
- two strange things to do with neural nets☆15Feb 18, 2019Updated 7 years ago
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.☆4,473Jul 29, 2025Updated 10 months ago
- Python library for building highly effective data science workflows☆948Jul 20, 2023Updated 2 years ago
- Approximate Nearest Neighbor Search for Sparse Data in Python!☆918Oct 2, 2020Updated 5 years ago
- Snips Python library to extract meaning from text☆3,971May 22, 2023Updated 3 years ago
- Python library that makes it easy for data scientists to create charts.☆3,636Oct 16, 2024Updated last year
- Python memoization across program runs.☆106Nov 19, 2018Updated 7 years ago
- ☆11Nov 17, 2017Updated 8 years ago
- Perform lexical analysis on words, one word at a time.☆64Jun 6, 2018Updated 8 years ago
- A context-preserving word cloud generator☆441Jul 6, 2023Updated 2 years ago
- Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://…☆2,392Aug 26, 2021Updated 4 years ago
- Tensorflow implementation of Facebook TagSpace☆74Jan 29, 2019Updated 7 years ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents☆290Jun 11, 2023Updated 3 years ago
- A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural …☆2,934Nov 7, 2022Updated 3 years ago
- Concurrent data pipelines in Python >>>☆1,596Jul 20, 2023Updated 2 years ago
- A fast, efficient universal vector embedding utility package.☆1,662Aug 3, 2023Updated 2 years ago
- Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk☆14,252Oct 29, 2025Updated 7 months ago
- GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network.☆1,265Oct 31, 2019Updated 6 years ago
- ☆3,170Nov 16, 2021Updated 4 years ago
- Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s…☆8,504Apr 1, 2026Updated 2 months ago
- FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)☆1,160Jun 1, 2024Updated 2 years ago
- Python Fast Dataflow programming framework for data pipeline work (web crawler, machine learning, quantitative trading, etc.)☆1,196Feb 3, 2026Updated 4 months ago
- A Keras model that addresses the Quora Question Pairs dyadic prediction task.☆14Feb 18, 2017Updated 9 years ago
- An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some addit…☆198Aug 8, 2017Updated 8 years ago