capitalone / datacompy
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
☆502 Updated last week
Alternatives and similar repositories for datacompy:
Users who are interested in datacompy are comparing it to the libraries listed below:
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆166 Updated last year
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … ☆626 Updated last week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆1,982 Updated 2 weeks ago
- Great Expectations Airflow operator ☆160 Updated 3 months ago
- Python API for Deequ ☆739 Updated 3 months ago
- PySpark test helper methods with beautiful error messages ☆656 Updated 2 weeks ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆658 Updated last month
- Turning PySpark Into a Universal DataFrame API ☆354 Updated this week
- dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org) ☆972 Updated last week
- Snowflake Connector for Python ☆612 Updated this week
- Port(ish) of Great Expectations to dbt test macros ☆1,127 Updated last month
- Apache Airflow integration for dbt ☆401 Updated 8 months ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆106 Updated this week
- Distributed SQL Engine in Python using Dask ☆398 Updated 5 months ago
- Monitor the stability of a Pandas or Spark dataframe ⚙︎ ☆499 Updated this week
- Create HTML profiling reports from Apache Spark DataFrames ☆195 Updated 4 years ago
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆414 Updated 2 weeks ago
- python automatic data quality check toolkit ☆284 Updated 4 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 Updated 2 years ago
- Dagster Labs' open-source data platform, built with Dagster. ☆304 Updated this week
- A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros. ☆183 Updated last year
- ☆197 Updated last year
- Generate and Visualize Data Lineage from query history ☆316 Updated last year
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆180 Updated this week
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. ☆361 Updated last week
- Read/Write pandas DataFrames with Tableau Hyper Extracts ☆117 Updated 2 months ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,492 Updated last month
- Type System for Data Analysis in Python ☆210 Updated 5 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,038 Updated 4 months ago
- A web frontend for scheduling Jupyter notebook reports ☆253 Updated 2 months ago