sodadata / soda-spark
Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
☆64 Updated 3 years ago
Alternatives and similar repositories for soda-spark
Users who are interested in soda-spark are comparing it to the libraries listed below.
- A Python Library to support running data quality rules while the spark job is running⚡ ☆189 Updated this week
- Great Expectations Airflow operator ☆167 Updated this week
- Apache Airflow integration for dbt ☆410 Updated last year
- Schema modelling framework for decentralised domain-driven ownership of data. ☆256 Updated last year
- Delta Lake examples ☆226 Updated 10 months ago
- Airflow Providers containing Deferrable Operators & Sensors from Astronomer ☆149 Updated this week
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆218 Updated last month
- Delta Lake helper methods in PySpark ☆325 Updated 11 months ago
- Library to convert DBT manifest metadata to Airflow tasks ☆48 Updated last year
- Delta Lake helper methods. No Spark dependency. ☆23 Updated 11 months ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆169 Updated last year
- ☆201 Updated last year
- A repository of sample code to accompany our blog post on Airflow and dbt. ☆175 Updated 2 years ago
- Delta lake and filesystem helper methods ☆51 Updated last year
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆441 Updated last month
- Airflow Unit Tests and Integration Tests ☆260 Updated 2 years ago
- Rules based grant management for Snowflake ☆40 Updated 6 years ago
- A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow. ☆204 Updated last week
- Generate and Visualize Data Lineage from query history ☆326 Updated 2 years ago
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆246 Updated last month
- A repository of sample code to show data quality checking best practices using Airflow. ☆78 Updated 2 years ago
- Pylint plugin for static code analysis on Airflow code ☆95 Updated 4 years ago
- The athena adapter plugin for dbt (https://getdbt.com) ☆251 Updated 6 months ago
- Enforce Best Practices for all your Airflow DAGs. ⭐ ☆104 Updated last week
- ☆42 Updated 4 years ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆260 Updated last month
- ✨ A Pydantic to PySpark schema library ☆100 Updated last week
- Spark style guide ☆262 Updated 10 months ago
- Fast iterative local development and testing of Apache Airflow workflows ☆202 Updated last week
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆218 Updated 3 months ago