itspawanbhardwaj / spark-fuzzy-matching
Fuzzy matching function in Spark (https://spark-packages.org/package/itspawanbhardwaj/spark-fuzzy-matching)
☆24 · Updated 5 years ago
Alternatives and similar repositories for spark-fuzzy-matching:
Users who are interested in spark-fuzzy-matching are comparing it to the libraries listed below.
- Spark functions to run popular phonetic and string matching algorithms ☆60 · Updated 2 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆61 · Updated 4 months ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆49 · Updated last year
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 · Updated 6 years ago
- control spark-shell from vim ☆10 · Updated 8 years ago
- PySpark phonetic and string matching algorithms ☆37 · Updated 11 months ago
- type-class based data cleansing library for Apache Spark SQL ☆79 · Updated 5 years ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications ☆35 · Updated last month
- A pyspark lib to validate data quality ☆18 · Updated 2 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 · Updated last year
- JSON schema parser for Apache Spark ☆81 · Updated 2 years ago
- Filling in the Spark function gaps across APIs ☆50 · Updated 3 years ago
- How to evaluate the Quality of your Data with Great Expectations and Spark. ☆29 · Updated last year
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 · Updated 5 years ago
- Provide functionality to build statistical models to repair dirty tabular data in Spark ☆12 · Updated last year
- A library for exporting Spark ML models and pipelines to PFA ☆54 · Updated 6 years ago
- Observability Python library - Powered by Kensu ☆22 · Updated 3 months ago
- Examples for High Performance Spark ☆15 · Updated 2 months ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 · Updated last year
- Nested array transformation helper extensions for Apache Spark ☆37 · Updated last year
- A collection of “cookbook-style” scripts for simplifying data engineering and machine learning in Apache Spark. ☆13 · Updated 3 years ago
- Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL ☆90 · Updated last year