Record matching and entity resolution at scale in Spark
☆36Oct 31, 2023Updated 2 years ago
Alternatives and similar repositories for spark-matcher
Users that are interested in spark-matcher are comparing it to the libraries listed below.
- Repository for performing Blocking using Deep Learning based on the paper "Deep Learning for Blocking in Entity Matching: A Design Space …☆30Apr 5, 2023Updated 3 years ago
- An End-to-End Evaluation Framework for Entity Resolution Systems☆36Dec 3, 2023Updated 2 years ago
- ☆15Aug 11, 2022Updated 3 years ago
- Spark Monitoring☆13Feb 28, 2023Updated 3 years ago
- SHAP-based validation for linear and tree-based models. Applied to binary, multiclass and regression problems.☆152Apr 19, 2025Updated last year
- Scrapes job data from Glassdoor. Fast and free Glassdoor Scraper to extract all data from job listings including salaries, companies, and…☆17Dec 20, 2023Updated 2 years ago
- Fuzzy matching function in spark (https://spark-packages.org/package/itspawanbhardwaj/spark-fuzzy-matching)☆24Dec 30, 2019Updated 6 years ago
- Monitor the stability of a Pandas or Spark dataframe ⚙︎☆511Jan 9, 2026Updated 4 months ago
- ☆21Mar 26, 2017Updated 9 years ago
- ☆10Jun 29, 2021Updated 4 years ago
- Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.☆28Oct 30, 2021Updated 4 years ago
- Stanford Entity-Resolution Framework☆24Jun 23, 2018Updated 7 years ago
- Ordeq simplifies IO and modularizes pipeline logic.☆41Dec 19, 2025Updated 4 months ago
- Soufflé Datalog Language Server. Add smart features to the Soufflé Datalog Language with the help of LSP in a VS code plugin☆15Sep 30, 2023Updated 2 years ago
- Bluetooth Indoor Positioning with DNNs☆13Mar 28, 2022Updated 4 years ago
- Continuous Benchmark of Filtering methods for Entity Resolution☆11Jul 20, 2025Updated 9 months ago
- UI for JedAI Toolkit☆17May 20, 2022Updated 3 years ago
- ☆18Nov 9, 2025Updated 6 months ago
- SparkER: an Entity Resolution framework for Apache Spark☆65Mar 29, 2024Updated 2 years ago
- An example Jupyter Book project integrated with Read the Docs☆20Jan 12, 2026Updated 3 months ago
- Replication codes for Deep Learning Credit Risk Modeling by Manzo, Qiao☆21May 9, 2022Updated 4 years ago
- A Generalized Data Cleaning System☆51Apr 28, 2016Updated 10 years ago
- Implementation of algorithms from the paper "Globally-Consistent Rule-Based Summary-Explanations for Machine Learning Models: Application…☆24Jun 4, 2022Updated 3 years ago
- Bachelor's Thesis on Adversarial Machine Learning Attacks and Defences☆17Nov 18, 2022Updated 3 years ago
- Scripts to install nodejs, lessc, newrelic, ..., on alwaysdata servers☆13Dec 10, 2019Updated 6 years ago
- Postgresql 11 Cluster for Docker Compose & -Swarm Stack, e.g. within Portainer☆12Jun 26, 2021Updated 4 years ago
- ☆11Aug 20, 2024Updated last year
- Docker Monitoring and Management Client☆26Feb 12, 2015Updated 11 years ago
- Create and manipulate Tableau Hyper files from Apache Spark DataFrames and Spark SQL☆31Jan 8, 2026Updated 4 months ago
- Helpers & syntactic sugar for PySpark.☆62Dec 4, 2025Updated 5 months ago
- Docker Compose files for private WebPagetest instance.☆14Mar 10, 2019Updated 7 years ago
- JedAI-WebApp is a GUI that facilitates the execution of JedAI. JedAI is an open source, high scalability toolkit that offers out-of-the-b…☆26Apr 14, 2023Updated 3 years ago
- A collection of python utility functions☆11Apr 30, 2026Updated last week
- ☆13Feb 10, 2023Updated 3 years ago
- mrhyde-tools gem - static site quick starter script wizard .:. jekyll command line tool☆14Aug 2, 2022Updated 3 years ago
- ☆10Feb 2, 2023Updated 3 years ago
- A swarm of LLM agents that will help you test, document, and productionize your code!☆18Apr 27, 2026Updated last week
- My presentation at ODSC India 2018 about Deep Learning with Apache Spark☆27Sep 1, 2018Updated 7 years ago
- Samples of authenticating to an Azure Key Vault vault☆13May 10, 2022Updated 3 years ago