MNoorFawi / lshashing
Python library to perform Locality-Sensitive Hashing for faster nearest-neighbor search in high-dimensional data
☆19 Updated last year
Alternatives and similar repositories for lshashing
Users who are interested in lshashing are comparing it to the libraries listed below.
- A Python package for running directed acyclic graphs of asynchronous I/O operations ☆17 Updated 4 years ago
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆25 Updated 3 years ago
- machine learning model performance metrics & charts with confidence intervals, optimized with numba to be fast ☆16 Updated 4 years ago
- Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production… ☆29 Updated 2 years ago
- Python package for deduplication/entity resolution using active learning ☆83 Updated last year
- Efficient BM25 with DuckDB 🦆 ☆59 Updated last year
- Fast fuzzy text search ☆11 Updated 2 years ago
- Distributed persistent Task Queue running on Dask ☆38 Updated 2 years ago
- Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching ☆19 Updated 6 years ago
- Pipeline components that support partial_fit. ☆46 Updated last year
- SPEAR: Programmatically label and build training data quickly. ☆109 Updated last year
- Exploring the classical regression capabilities of LLMs. ☆18 Updated last year
- Neural Solr = Solr 9 + Mighty Inference + Node ☆18 Updated 3 years ago
- Deep Learning how-to's using Lance file format ☆22 Updated 7 months ago
- Reinforcement Learning Recommender System suggesting relevant scientific services to appropriate researchers ☆11 Updated last year
- A neural network hyperparameter tuner ☆30 Updated 2 years ago
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations" ☆17 Updated 4 years ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆40 Updated 2 years ago
- Record matching and entity resolution at scale in Spark ☆36 Updated 2 years ago
- Public repository holding examples for dataheroes library ☆25 Updated 7 months ago
- Have UV deal with all your Jupyter deps. ☆28 Updated last year
- Examples of vector DB indexing and query with various vector databases. ☆13 Updated 10 months ago
- Meadowflow is a proof-of-concept/prototype job scheduler built to explore the idea of implicit data dependency management. ☆10 Updated 2 years ago
- Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning ☆53 Updated 3 years ago
- A simple and streamlined Python script to extract and filter links from a remote HTML resource. ☆24 Updated 11 months ago
- Serverless for data practitioners. The fastest ⚡️ way to run your code in the cloud. Effortlessly run scripts, functions, and Jupyter not… ☆41 Updated last year
- Identify bias and measure fairness of your data ☆95 Updated 3 weeks ago
- Comparing Polars to Pandas and a small introduction ☆44 Updated 4 years ago
- Cloud-agnostic Python API ☆61 Updated last year
- A library to use `modal` as a backend for `joblib`. ☆32 Updated 11 months ago