capitalone / datacompy
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
☆484 · Updated this week
Related projects
Alternatives and complementary repositories for datacompy
- PySpark test helper methods with beautiful error messages ☆616 · Updated 2 weeks ago
- Python API for Deequ ☆727 · Updated 3 weeks ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆640 · Updated 3 weeks ago
- Great Expectations Airflow operator ☆159 · Updated last week
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆166 · Updated last year
- Apache Airflow integration for dbt ☆396 · Updated 5 months ago
- Delta Lake helper methods in PySpark ☆304 · Updated 2 months ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆104 · Updated last week
- dbt adapter for SQL Server and Azure SQL ☆214 · Updated last week
- Snowflake Snowpark Python API ☆269 · Updated this week
- dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org) ☆916 · Updated this week
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆400 · Updated this week
- Port(ish) of Great Expectations to dbt test macros ☆1,077 · Updated last month
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆171 · Updated this week
- Turning PySpark Into a Universal DataFrame API ☆317 · Updated this week
- Dynamically generate Apache Airflow DAGs from YAML configuration files ☆1,197 · Updated this week
- Create HTML profiling reports from Apache Spark DataFrames ☆195 · Updated 4 years ago
- Snowflake SQLAlchemy ☆233 · Updated this week
- Guides and docs to help you get up and running with Apache Airflow. ☆799 · Updated 2 years ago
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆1,903 · Updated this week
- Distributed SQL Engine in Python using Dask ☆393 · Updated 2 months ago
- Macros that generate dbt code ☆485 · Updated 3 weeks ago
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … ☆621 · Updated this week
- Snowflake Connector for Python ☆594 · Updated this week
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,005 · Updated last month
- A dbt package for modelling dbt metadata. https://brooklyn-data.github.io/dbt_artifacts ☆330 · Updated last week