capitalone / datacompy
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
☆574 · Updated this week
Alternatives and similar repositories for datacompy
Users who are interested in datacompy are comparing it to the libraries listed below.
- PySpark test helper methods with beautiful error messages ☆696 · Updated last month
- Python API for Deequ ☆771 · Updated 2 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,082 · Updated 2 months ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆672 · Updated 2 months ago
- Better SQL in Jupyter. 📊 ☆780 · Updated 2 months ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,511 · Updated 6 months ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆168 · Updated last year
- Turning PySpark Into a Universal DataFrame API ☆403 · Updated this week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆2,099 · Updated this week
- Python automatic data quality check toolkit ☆283 · Updated 4 years ago
- Distributed SQL Engine in Python using Dask ☆405 · Updated 9 months ago
- dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org) ☆1,086 · Updated this week
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … ☆636 · Updated last month
- Monitor the stability of a Pandas or Spark dataframe ⚙︎ ☆501 · Updated 4 months ago
- Port(ish) of Great Expectations to dbt test macros ☆1,169 · Updated 5 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆216 · Updated 3 weeks ago
- Great Expectations Airflow operator ☆164 · Updated last week
- Snowflake Connector for Python ☆648 · Updated this week
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆190 · Updated last week
- Making DAG construction easier ☆265 · Updated 2 weeks ago
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. ☆369 · Updated 2 weeks ago
- Snowflake SQLAlchemy ☆250 · Updated this week
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,561 · Updated last year
- Apache Airflow integration for dbt ☆404 · Updated last year
- Fast iterative local development and testing of Apache Airflow workflows ☆201 · Updated last month
- SQL upsert using pandas DataFrames for PostgreSQL, SQLite and MySQL with extra features ☆229 · Updated last year
- Schema modelling framework for decentralised domain-driven ownership of data. ☆254 · Updated last year
- Construct Apache Airflow DAGs Declaratively via YAML configuration files ☆1,303 · Updated this week
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated last year
- ☆199 · Updated last year