Spark functions to run popular phonetic and string matching algorithms
☆60Feb 22, 2022Updated 4 years ago
Alternatives and similar repositories for spark-stringmetric
Users that are interested in spark-stringmetric are comparing it to the libraries listed below.
- PySpark phonetic and string matching algorithms☆41Feb 19, 2024Updated 2 years ago
- low-level helpers for Apache Spark libraries and tests☆16Dec 29, 2018Updated 7 years ago
- Speak Slack notifications and process Slack slash commands☆15Dec 20, 2018Updated 7 years ago
- Test suite to document the behavior of Spark☆21Apr 15, 2021Updated 5 years ago
- Write property based tests easily on spark dataframes☆21Jan 19, 2024Updated 2 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures.☆74Mar 14, 2021Updated 5 years ago
- ☆12Nov 2, 2024Updated last year
- Quartz Extension and utilities for cron-style scheduling in Apache Pekko☆12Dec 25, 2025Updated 5 months ago
- Essential Spark extensions and helper methods ✨😲☆767Sep 14, 2025Updated 8 months ago
- A library that brings useful functions from various modern database management systems to Apache Spark☆63Sep 4, 2023Updated 2 years ago
- Pandas helper functions☆31Feb 19, 2023Updated 3 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive☆191Oct 15, 2025Updated 7 months ago
- sbt plugin to allow dependency resolution and artifact publishing for gitlab☆10Mar 1, 2026Updated 2 months ago
- Expressive types for Spark.☆897May 20, 2026Updated last week
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)☆455Apr 2, 2026Updated last month
- Scala wrapper for SnakeYAML☆103Sep 13, 2022Updated 3 years ago
- type-class based data cleansing library for Apache Spark SQL☆78Jun 23, 2019Updated 6 years ago
- A module for the decline command line parser to enable bash and zsh autocomplete☆14Aug 7, 2023Updated 2 years ago
- This project is about numbers: exact (1, e, π, 𝛙, √2, etc.), fuzzy e.g., 1836.152673426(32), or lazy e.g., cos(2π), as quantities (with …☆16Apr 30, 2026Updated 3 weeks ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou…☆10Jul 31, 2023Updated 2 years ago
- Repository of my talk for Bayes@Lund 2017☆10Oct 4, 2017Updated 8 years ago
- Distributed Bayesian Entity Resolution in Apache Spark☆60Jun 10, 2021Updated 4 years ago
- Scala data validation library☆30Aug 14, 2016Updated 9 years ago
- ⚡ Live demo environment for Django Templates fully rendered in the browser, with PyScript☆12Sep 21, 2022Updated 3 years ago
- A giter8 template for Spark SBT projects☆72Mar 20, 2021Updated 5 years ago
- Predicting survival on the Titanic☆16Dec 16, 2017Updated 8 years ago
- Apache (Py)Spark type annotations (stub files).☆118Aug 17, 2022Updated 3 years ago
- Creating Debian Packages from CRAN Sources☆12Jul 1, 2020Updated 5 years ago
- A collection of Lambda-related implementations, libraries, resources and useful stuff.☆15Aug 26, 2022Updated 3 years ago
- A low-dependency HTTP health check server for Scala☆14May 13, 2026Updated 2 weeks ago
- String metrics and phonetic algorithms for Scala (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gr…☆492Jul 28, 2017Updated 8 years ago
- A tool to validate data, built around Apache Spark.☆102May 18, 2026Updated last week
- Support for JDK9's Multi Release JAR Files (JEP 238)☆17Sep 5, 2024Updated last year
- ☆24May 12, 2026Updated 2 weeks ago
- Fast JSON parser/generator for Scala☆115Mar 22, 2026Updated 2 months ago
- ☆15Oct 11, 2019Updated 6 years ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included.☆10May 12, 2023Updated 3 years ago
- Utilities for writing tests that use Apache Spark.☆24Dec 29, 2018Updated 7 years ago
- Scala library for sketching, locality sensitive hashing, approximate similarity search and other things☆34Mar 14, 2017Updated 9 years ago