Data validation library for PySpark 3.0.0
☆33 · Nov 11, 2022 · Updated 3 years ago
Alternatives and similar repositories for check-engine
Users that are interested in check-engine are comparing it to the libraries listed below.
- A pyspark lib to validate data quality ☆19 · Nov 11, 2022 · Updated 3 years ago
- ☆10 · Feb 18, 2021 · Updated 5 years ago
- A tool to validate data, built around Apache Spark. ☆102 · Updated this week
- Rocksdb state storage implementation for Structured Streaming. ☆17 · Oct 21, 2020 · Updated 5 years ago
- ☆15 · May 31, 2023 · Updated 3 years ago
- A toolset to streamline running spark python on EMR ☆20 · Nov 16, 2016 · Updated 9 years ago
- Extensible streaming ingestion pipeline on top of Apache Spark ☆47 · Jul 17, 2025 · Updated 11 months ago
- Open source stack lakehouse ☆25 · Mar 2, 2024 · Updated 2 years ago
- Spark Library for Bulk Loading into Cassandra ☆12 · Apr 18, 2018 · Updated 8 years ago
- VSCode Dev Container template for AWS Glue jobs development ☆20 · Jul 25, 2024 · Updated last year
- recipes for BASH, Docker and more ☆13 · Aug 24, 2025 · Updated 9 months ago
- Filling in the Spark function gaps across APIs ☆50 · Apr 14, 2021 · Updated 5 years ago
- Apache Spark with HDFS cluster within Kubernetes ☆11 · Jul 11, 2023 · Updated 2 years ago
- A Gentle introduction to Machine Learning with Apache Spark ☆11 · Mar 2, 2026 · Updated 3 months ago
- Data Connector SDK and samples for Power Query and Power BI ☆10 · Jun 17, 2021 · Updated 5 years ago
- ☆10 · Jun 29, 2021 · Updated 4 years ago
- Asynchronous actions for PySpark ☆47 · Dec 2, 2021 · Updated 4 years ago
- A Hubot script for creating quick reminders through natural language. ☆11 · Jun 29, 2017 · Updated 8 years ago
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML ☆15 · Dec 24, 2016 · Updated 9 years ago
- This package takes a CSV and generates SQL statements to create a temp table and insert the data within it into the temp table. Good whe… ☆16 · May 28, 2024 · Updated 2 years ago
- Gobblin is a distributed big data integration framework (ingestion, replication, compliance, retention) for batch and streaming systems.… ☆11 · Jul 29, 2017 · Updated 8 years ago
- Code to demonstrate data engineering metadata & logging best practices ☆21 · Mar 12, 2024 · Updated 2 years ago
- ☆25 · Jul 9, 2023 · Updated 2 years ago
- The one file simple bug tracking application that incorporates a kanban board. ☆12 · Jan 31, 2014 · Updated 12 years ago
- Talks given about Go and Gonum delivered by Gonum developers. ☆13 · Apr 13, 2019 · Updated 7 years ago
- Basic Spark utilities ☆13 · Feb 20, 2025 · Updated last year
- Firewalla Scripts for APC UPS Daemon ☆13 · Dec 13, 2020 · Updated 5 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆56 · May 6, 2023 · Updated 3 years ago
- Tracebacks for Humans (in Jupyter notebooks) ☆12 · Dec 30, 2025 · Updated 5 months ago
- ☆10 · Jun 29, 2023 · Updated 2 years ago
- Docker images for Exercism tracks ☆16 · Feb 26, 2025 · Updated last year
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w… ☆16 · May 21, 2026 · Updated 3 weeks ago
- Avro Schema Shredder is a REST API that enables storage of Avro Schemas in Apache Atlas. This API enables an organization to use Apache A… ☆13 · Jan 11, 2017 · Updated 9 years ago
- A Kubernetes operator to enable GitOps style deploys for Databricks resources ☆16 · Jun 3, 2025 · Updated last year
- Packer Template to build a AWS Apache Cassandra AMI ☆10 · Jan 3, 2022 · Updated 4 years ago
- Playing with Play and ReST - or a simple, uniform, and introspectable way to declare ReST api ☆24 · Feb 28, 2020 · Updated 6 years ago
- Sample Application for Remote Kubernetes Development in Visual Studio Code with Okteto ☆11 · Jun 2, 2026 · Updated 2 weeks ago
- Example to create lineage in Atlas with sqoop and spark ☆14 · Apr 5, 2017 · Updated 9 years ago
- Adds a framework to enable Natural Language interactions in your Hubot scripts ☆11 · Dec 6, 2016 · Updated 9 years ago