maropu/spark-data-repair-plugin



Provides functionality to build statistical models to repair dirty tabular data in Spark

☆12

Alternatives and similar repositories for spark-data-repair-plugin

Users who are interested in spark-data-repair-plugin are comparing it to the libraries listed below.


j-r77 / cfddiscovery
☆11 · Oct 31, 2019 · Updated 6 years ago
codocedo / tane
Implementation of TANE for experimental purposes
☆15 · Apr 29, 2022 · Updated 4 years ago
HPI-Information-Systems / metanome-algorithms
Source code for several Metanome data profiling algorithms
☆58 · May 15, 2023 · Updated 3 years ago
Kalli / clubster-analysis
Analysis of club lineups on ResidentAdvisor.net
☆20 · Oct 15, 2022 · Updated 3 years ago
GigahexHQ / jetprobe
🚀 Validation DSL for data pipelines
☆25 · Jun 12, 2018 · Updated 8 years ago
Azure / cortana-intelligence-population-health-management
Population Health Management
☆21 · Dec 1, 2018 · Updated 7 years ago
datitran / cf-demo
End-to-end data science example running on Cloud Foundry
☆19 · Jul 19, 2016 · Updated 10 years ago
data-mill-cloud / data-mill
A K8s-based infrastructure for analytics
☆24 · Jan 15, 2020 · Updated 6 years ago
jeremywho / pluginTest
Simple Go 1.8 plugin test for https://jeremywho.com/go-1.8---plugins/
☆10 · Feb 28, 2017 · Updated 9 years ago
amChristonasis / Twitter-Sentiment-Analysis-in-Python
Architecture of Streaming Twitter Data into Apache Kafka cluster, performing simple sentiment analysis with afinn module, storing the dat…
☆20 · Jan 3, 2020 · Updated 6 years ago
reactore / generic-akka-http-rest
☆11 · Oct 17, 2016 · Updated 9 years ago
fastai / docments
Document parameters using comments
☆10 · Aug 6, 2021 · Updated 4 years ago
dubeyrupesh / Gatling-Kinesis
This project aims at doing performance testing of AWS Kinesis stream
☆11 · May 16, 2020 · Updated 6 years ago
woodruffw / libbdiff
A library for creating and patching binary diffs. Based on bsdiff.
☆11 · Nov 23, 2014 · Updated 11 years ago
go-daq / smbus
smbus provides access to the System Management bus over I2C
☆15 · Dec 16, 2020 · Updated 5 years ago
scalingpythonml / scaling-python-with-dask
A work-in-progress book on Dask
☆12 · Jul 15, 2023 · Updated 3 years ago
yennanliu / utility_shell
Collection of shell/Bash scripts for various use cases | #SE
☆11 · Jul 10, 2026 · Updated 2 weeks ago
donatellosantoro / Llunatic
The Llunatic Mapping and Cleaning Chase Engine
☆38 · Jan 12, 2024 · Updated 2 years ago
chinaran / go-httpbin
A reasonably complete and well-tested golang port of httpbin, with zero dependencies outside the go stdlib.
☆11 · Nov 24, 2025 · Updated 8 months ago
UBOdin / jitd
Just in Time Datastructures
☆11 · Feb 21, 2017 · Updated 9 years ago
treasure-data / td-jdbc
JDBC Driver for Treasure Data
☆11 · May 1, 2024 · Updated 2 years ago
corriebar / statrethinking_reading_group
Material for the Berlin Bayesian reading group covering Statistical Rethinking by Richard McElreath
☆10 · May 7, 2020 · Updated 6 years ago
nguqtruong / tiki-price-watch
Track TIKI product price changes with GitHub Actions
☆14 · Jan 16, 2022 · Updated 4 years ago
wizhidev / struct_to_json
A library for fast conversion between C structs and JSON
☆10 · Nov 27, 2017 · Updated 8 years ago
loki-os / go-cip
A complete golang implementation of the Common Industrial Protocol
☆11 · Dec 26, 2020 · Updated 5 years ago
sjrd / scalajs-reflect
Proof of concept of a "linker plugin" enabling some reflection for Scala.js
☆12 · Oct 18, 2016 · Updated 9 years ago
potix2 / spark-google-spreadsheets
Google Spreadsheets datasource for SparkSQL and DataFrames
☆58 · Jul 24, 2023 · Updated 3 years ago
PacktPublishing / Machine-Learning-Quick-Reference
Machine Learning Quick Reference, published by Packt
☆17 · Jan 30, 2023 · Updated 3 years ago
scala-infer / scala-infer
Scala embedded universal probabilistic programming language
☆11 · Apr 15, 2021 · Updated 5 years ago
sparkleondata / GenericSpark
GenericSpark
☆10 · Jun 12, 2015 · Updated 11 years ago
VicaYang / THU-Exam-LaTeX-Template
A LaTeX Template for THU Examination
☆41 · Aug 16, 2022 · Updated 3 years ago
vvvy / rust-hdfs-native
Native HDFS client for Rust
☆13 · Nov 11, 2018 · Updated 7 years ago
ttscoff / starter-book
An online book.
☆11 · Jan 24, 2015 · Updated 11 years ago
yaooqinn / itachi
A library that brings useful functions from various modern database management systems to Apache Spark
☆63 · Sep 4, 2023 · Updated 2 years ago
Aiven-Open / aiven-mysql-migrate
MySQL® migration tool
☆13 · Updated this week
CausalML / interventions-disparate-impact-responders
Assessing Disparate Impacts of Personalized Interventions: Identifiability and Bounds
☆11 · Oct 28, 2019 · Updated 6 years ago
soosinha / opensearch-analysis-ik
The IK Analysis plugin integrates the Lucene IK analyzer into Elasticsearch and supports customized dictionaries.
☆12 · May 7, 2021 · Updated 5 years ago
scala-js / scala-js-env-jsdom-nodejs
Node.js with jsdom environment for Scala.js
☆13 · Oct 9, 2025 · Updated 9 months ago
johnynek / unuhi
☆10 · Sep 5, 2018 · Updated 7 years ago