capitalone / datacompy
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
☆628, updated this week
Alternatives and similar repositories for datacompy
Users who are interested in datacompy are comparing it to the libraries listed below.
- pyspark methods to enhance developer productivity (☆682, updated 10 months ago)
- Python API for Deequ (☆811, updated this week)
- PySpark test helper methods with beautiful error messages (☆747, updated 2 weeks ago)
- Turning PySpark Into a Universal DataFrame API (☆477, updated this week)
- Great Expectations Airflow operator (☆169, updated this week)
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… (☆2,136, updated 3 weeks ago)
- Monitor the stability of a Pandas or Spark dataframe (☆510, updated 2 weeks ago)
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. (☆168, updated 2 years ago)
- Possibly the fastest DataFrame-agnostic quality check library in town. (☆234, updated 3 months ago)
- Snowflake SQLAlchemy (☆260, updated last week)
- Delta Lake helper methods in PySpark (☆326, updated last week)
- Apache Airflow integration for dbt (☆411, updated last year)
- Dagster Labs' open-source data platform, built with Dagster. (☆432, updated this week)
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io (☆2,273, updated this week)
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow (☆223, updated last month)
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … (☆652, updated 3 weeks ago)
- Port(ish) of Great Expectations to dbt test macros (☆1,203, updated last year)
- Read/Write pandas DataFrames with Tableau Hyper Extracts (☆121, updated 3 months ago)
- Snowflake Connector for Python (☆705, updated this week)
- A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros. (☆186, updated 2 years ago)
- Useful macros when performing data audits (☆391, updated last week)
- Generate and Visualize Data Lineage from query history (☆327, updated 2 years ago)
- Making DAG construction easier (☆283, updated last month)
- Better SQL in Jupyter. (☆838, updated 3 weeks ago)
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects (☆225, updated 8 months ago)
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. (☆377, updated 8 months ago)
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… (☆114, updated 2 months ago)
- A lightweight Python-based tool for extracting and analyzing data column lineage for dbt projects (☆195, updated 10 months ago)
- CLI that makes it easy to create, test and deploy Airflow DAGs to Astronomer (☆431, updated this week)
- Snowflake Snowpark Python API (☆323, updated this week)