spbail / data-quality-tools
Content for a talk on "The wonderful world of data quality tools in Python"
☆19 Updated 3 years ago
Alternatives and similar repositories for data-quality-tools:
Users who are interested in data-quality-tools are comparing it to the repositories listed below
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆49 Updated last year
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 7 months ago
- Full stack data engineering tools and infrastructure set-up ☆50 Updated 4 years ago
- Evaluation Matrix for Change Data Capture ☆25 Updated 7 months ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 Updated last year
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆49 Updated 4 months ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆27 Updated 2 years ago
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36 Updated 8 months ago
- Fake Pandas / PySpark DataFrame creator ☆46 Updated last year
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11 Updated 10 months ago
- Cost Efficient Data Pipelines with DuckDB ☆50 Updated 7 months ago
- PySpark schema generator ☆42 Updated 2 years ago
- ☆17 Updated 7 months ago
- This is a simple analytic project using DuckDB & dbt with air quality data. ☆19 Updated last year
- Utility functions for dbt projects running on Spark ☆31 Updated last month
- Dask integration for Snowflake ☆30 Updated 4 months ago
- Delta lake and filesystem helper methods ☆51 Updated last year
- ☆33 Updated 2 weeks ago
- csv and flat-file sniffer built in Rust. ☆42 Updated last year
- Fully unit tested utility functions for data engineering. Python 3 only. ☆15 Updated 7 months ago
- ☆42 Updated 2 weeks ago
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce ☆27 Updated 2 months ago
- Collection of utility scripts to extract code so it can be upgraded to Snowflake using the SnowConvert tool. ☆13 Updated last week
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆21 Updated 2 years ago
- A cool simple example of functional data engineering ☆33 Updated 2 years ago
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows ☆43 Updated 8 months ago
- ☆15 Updated 10 months ago
- Demo of DuckDB with the dashboarding tools Evidence, Streamlit and Rill ☆16 Updated last year
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆183 Updated last week