karenbajador / pyspark_greatexpectations
☆12Updated 3 years ago
Alternatives and similar repositories for pyspark_greatexpectations
Users that are interested in pyspark_greatexpectations are comparing it to the libraries listed below
- Repository containing various utils related to Snowflake migration at Faire.☆12Updated 2 years ago
- Spark app to merge different schemas☆22Updated 4 years ago
- New generation opensource data stack☆70Updated 3 years ago
- Cost Efficient Data Pipelines with DuckDB☆54Updated 2 months ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included.☆10Updated 2 years ago
- ☆10Updated 3 years ago
- Full stack data engineering tools and infrastructure set-up☆53Updated 4 years ago
- A declarative PySpark framework for row- and aggregate-level data quality validation.☆49Updated 2 weeks ago
- ☆41Updated 5 months ago
- Streamlit application to explore Snowflake Tables☆43Updated last year
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows☆43Updated 3 weeks ago
- Utility functions for dbt projects running on Spark☆34Updated 5 months ago
- Fake Pandas / PySpark DataFrame creator☆47Updated last year
- DataOps Observability Integration Agents are part of DataKitchen's Open Source Data Observability. They connect to various ETL, ELT, BI, …☆30Updated 2 months ago
- An LLM-powered chatbot with the added context of the dbt knowledge base.☆39Updated 7 months ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster☆28Updated 2 years ago
- Read Delta tables without any Spark☆47Updated last year
- Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team …☆120Updated last week
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on…☆28Updated 3 years ago
- Code for my "Efficient Data Processing in SQL" book.☆57Updated 11 months ago
- A write-audit-publish implementation on a data lake without the JVM☆46Updated 11 months ago
- A Snowflake GPT Demo using SqlAlchemy☆23Updated 2 years ago
- ☆17Updated 11 months ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash.☆14Updated 2 years ago
- Blog post on ETL pipelines with Airflow☆23Updated 5 years ago
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit…☆59Updated 3 weeks ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....☆75Updated last week
- Delta-Lake, ETL, Spark, Airflow☆47Updated 2 years ago
- Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market☆57Updated 2 years ago
- ☆75Updated 4 months ago