Fuzzy matching function in Spark (https://spark-packages.org/package/itspawanbhardwaj/spark-fuzzy-matching)
☆24 · Dec 30, 2019 · Updated 6 years ago
Alternatives and similar repositories for spark-fuzzy-matching
Users that are interested in spark-fuzzy-matching are comparing it to the libraries listed below
- ☆15 · Aug 11, 2022 · Updated 3 years ago
- Spark functions to run popular phonetic and string matching algorithms ☆59 · Feb 22, 2022 · Updated 4 years ago
- Scrapes job data from Glassdoor. Fast and free Glassdoor Scraper to extract all data from job listings including salaries, companies, and… ☆17 · Dec 20, 2023 · Updated 2 years ago
- Data engineering pipeline for the household COVID-19 Infection Survey (CIS) ☆10 · Jul 18, 2023 · Updated 2 years ago
- Material for the lecture Statistical Computing ☆11 · Jan 1, 2026 · Updated 2 months ago
- Record matching and entity resolution at scale in Spark ☆36 · Oct 31, 2023 · Updated 2 years ago
- PDF to JSON, JSON to PDF, etc. ☆12 · Apr 18, 2018 · Updated 7 years ago
- Stopwatch timer on JS ☆11 · Sep 4, 2023 · Updated 2 years ago
- phData Pulse application log aggregation and monitoring ☆13 · Apr 13, 2020 · Updated 5 years ago
- This web scraper extracts data from The Home Depot website; it can be run locally or on the Apify platform, the latter is… ☆10 · Oct 13, 2022 · Updated 3 years ago
- Package provides a Java implementation of latent Dirichlet allocation (LDA) for topic modelling ☆10 · May 18, 2017 · Updated 8 years ago
- Code Repository for Technical Program Manager's Handbook 2E, Published by Packt Publishing ☆16 · Sep 25, 2024 · Updated last year
- An awesome list that curates the best Flet tools, tutorials, blogs and more. ☆10 · Jan 8, 2023 · Updated 3 years ago
- ☆14 · Jul 8, 2025 · Updated 8 months ago
- ☆14 · Feb 10, 2023 · Updated 3 years ago
- Implementation based on lightrag and nano-graphrag to connect with psql ☆15 · Oct 28, 2024 · Updated last year
- A set of Jupyter Lab Notebooks and Other Implementations of Community Reports in Standard Form ☆18 · Apr 15, 2024 · Updated last year
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 · Aug 17, 2015 · Updated 10 years ago
- sbt plugin for scala modules. ☆14 · Mar 3, 2026 · Updated last week
- ☆10 · Sep 14, 2023 · Updated 2 years ago
- Plutus for the masses ☆11 · Jan 20, 2023 · Updated 3 years ago
- Friday Forecasting Talks materials ☆11 · May 24, 2024 · Updated last year
- vntokenizer 4.1 by LE-HONG Phuong ☆11 · Dec 13, 2016 · Updated 9 years ago
- ☆12 · Apr 17, 2024 · Updated last year
- Hadoop InputFormat for http://druid.io/ ☆10 · Oct 26, 2016 · Updated 9 years ago
- ☆10 · Feb 2, 2023 · Updated 3 years ago
- ☆12 · Mar 12, 2024 · Updated last year
- This repo is a curated list of places I consider for weekends in Athens with my kid. ☆11 · Dec 19, 2021 · Updated 4 years ago
- Repository for the family history/pedigree project ☆13 · Feb 24, 2026 · Updated 2 weeks ago
- Public repository for the biodata resource inventory performed in 2022. ☆11 · Nov 25, 2025 · Updated 3 months ago
- Meet Rustacean GPT, an experimental project transforming OpenAI's GPT into a helpful, autonomous software engineer to support senior deve… ☆14 · May 10, 2023 · Updated 2 years ago
- Bringing up Docker Compose environments for system, integration and performance testing, with support for ScalaTest and Gatling ☆11 · Jul 29, 2021 · Updated 4 years ago
- Blocking records for record linkage and data deduplication based on ANN algorithms in Python. ☆19 · Nov 28, 2025 · Updated 3 months ago
- Interior Point Conic Optimization Solver ☆10 · Feb 28, 2026 · Updated last week
- JOSM Plugin to visualize Atlas data ☆15 · Oct 29, 2020 · Updated 5 years ago
- JupyterLab Notebook for Mesosphere DC/OS ☆11 · Aug 6, 2019 · Updated 6 years ago
- Small examples of creative use of our projects ☆11 · Jul 30, 2025 · Updated 7 months ago
- ☆15 · Jun 30, 2023 · Updated 2 years ago
- AppSync Events frontend sample implementation ☆12 · Nov 16, 2024 · Updated last year