Fuzzy matching function in spark (https://spark-packages.org/package/itspawanbhardwaj/spark-fuzzy-matching)
☆24 Dec 30, 2019 Updated 6 years ago
Alternatives and similar repositories for spark-fuzzy-matching
Users that are interested in spark-fuzzy-matching are comparing it to the libraries listed below.
- ☆15 Aug 11, 2022 Updated 3 years ago
- Spark functions to run popular phonetic and string matching algorithms ☆60 Feb 22, 2022 Updated 4 years ago
- Scrapes job data from Glassdoor. Fast and free Glassdoor Scraper to extract all data from job listings including salaries, companies, and… ☆17 Dec 20, 2023 Updated 2 years ago
- A simple example application to help you load a play application in to Google Kubernetes Engine ☆16 Jan 24, 2018 Updated 8 years ago
- Predicting survival on the Titanic ☆16 Dec 16, 2017 Updated 8 years ago
- C# Sample that demonstrates how to handle deadlettered events in Azure Event Grid. ☆14 Apr 4, 2022 Updated 4 years ago
- This is a script that will read a Hive metastore and generate SQL Serverless create view statements. ☆14 Jul 2, 2020 Updated 5 years ago
- A Spark connector for the Azure Common Data Model ☆15 May 31, 2023 Updated 3 years ago
- Record matching and entity resolution at scale in Spark ☆36 Oct 31, 2023 Updated 2 years ago
- Patterns and examples for running R code with Azure Machine Learning ☆22 Sep 29, 2022 Updated 3 years ago
- A language detection Web Service ☆53 May 9, 2017 Updated 9 years ago
- ☆13 Feb 10, 2023 Updated 3 years ago
- Implementation of TANE for experimental purposes ☆15 Apr 29, 2022 Updated 4 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Jan 12, 2026 Updated 4 months ago
- Algebird's HyperLogLog support for Apache Spark. ☆10 Jul 20, 2017 Updated 8 years ago
- Data engineering pipeline for the household COVID-19 Infection Survey (CIS) ☆10 Jul 18, 2023 Updated 2 years ago
- R package for weighted model metrics ☆11 Apr 12, 2025 Updated last year
- sbt plugin for scala modules. ☆14 May 3, 2026 Updated 3 weeks ago
- ☆14 Nov 27, 2025 Updated 6 months ago
- ☆35 Jul 18, 2023 Updated 2 years ago
- An improved Python interface to SQLite ☆14 Feb 4, 2023 Updated 3 years ago
- Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (former… ☆36 Jul 17, 2023 Updated 2 years ago
- Anomaly Detection Pipeline on Azure Databricks ☆28 Jul 29, 2019 Updated 6 years ago
- A collection of Flink applications for working with Pravega streams ☆12 Dec 20, 2022 Updated 3 years ago
- Blocking records for record linkage and data deduplication based on ANN algorithms in Python. ☆20 Mar 9, 2026 Updated 2 months ago
- PDF to JSON, JSON to PDF and etc. ☆12 Apr 18, 2018 Updated 8 years ago
- A Jupyter kernel for the sqlite3 shell. This project was just a proof of concept. You should probably check out xeus-SQLite: https://blog… ☆15 Jan 23, 2019 Updated 7 years ago
- A scikit-learn-compatible module for Isolation-based anomaly detection using nearest-neighbor ensembles ☆12 Aug 30, 2023 Updated 2 years ago
- ☆23 Jan 19, 2015 Updated 11 years ago
- ☆15 Jun 30, 2023 Updated 2 years ago
- ☆11 May 5, 2023 Updated 3 years ago
- Meet Rustacean GPT, an experimental project transforming OpenAi's GPT into a helpful, autonomous software engineer to support senior deve… ☆14 May 10, 2023 Updated 3 years ago
- tsellm: LLMs in SQLite and DuckDB ☆26 Apr 21, 2025 Updated last year
- Play chatroom with Scala API ☆43 Apr 23, 2019 Updated 7 years ago
- Repository for GitDOX, a GitHub Data-storage Online XML editor ☆16 Feb 1, 2026 Updated 3 months ago
- An awesome list that curates the best Flet tools, tutorials, blogs and more. ☆10 Jan 8, 2023 Updated 3 years ago
- Maven plugin for scoverage ☆44 May 24, 2026 Updated last week
- Adds zipkin tracing instrumentation for Clojure applications ☆18 Mar 29, 2014 Updated 12 years ago
- State of the art time series forecasting method that has the FFORMA ensemble learn from the ESRNN hybrid model and others. ☆13 Sep 7, 2022 Updated 3 years ago