capitalone / datacompy
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
☆604 Updated last week
Alternatives and similar repositories for datacompy
Users who are interested in datacompy are comparing it to the libraries listed below.
- Python API for Deequ ☆797 Updated 6 months ago
- PySpark test helper methods with beautiful error messages ☆717 Updated 3 weeks ago
- Monitor the stability of a Pandas or Spark dataframe ⚙︎ ☆505 Updated last month
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆169 Updated last year
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆220 Updated this week
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆674 Updated 7 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆219 Updated 2 months ago
- Turning PySpark Into a Universal DataFrame API ☆434 Updated this week
- Snowflake SQLAlchemy ☆258 Updated 3 weeks ago
- Snowflake Snowpark Python API ☆315 Updated this week
- Great Expectations Airflow operator ☆167 Updated last week
- Apache Airflow integration for dbt ☆406 Updated last year
- Delta Lake helper methods in PySpark ☆324 Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆114 Updated last year
- Distributed SQL Engine in Python using Dask ☆407 Updated last year
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,117 Updated 6 months ago
- Create HTML profiling reports from Apache Spark DataFrames ☆198 Updated 5 years ago
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆2,182 Updated this week
- A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros. ☆186 Updated 2 years ago
- Snowflake Connector for Python ☆686 Updated last week
- Read/Write pandas DataFrames with Tableau Hyper Extracts ☆121 Updated this week
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. ☆375 Updated 4 months ago
- CLI that makes it easy to create, test and deploy Airflow DAGs to Astronomer ☆415 Updated this week
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆113 Updated 2 months ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,519 Updated 10 months ago
- Making DAG construction easier ☆272 Updated last week
- Dagster Labs' open-source data platform, built with Dagster. ☆401 Updated this week
- PyAirbyte brings the power of Airbyte to every Python developer. ☆300 Updated last week
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆62 Updated 2 years ago
- ✨ A Pydantic to PySpark schema library ☆106 Updated this week