moj-analytical-services / dataengineeringutils3
Fully unit tested utility functions for data engineering. Python 3 only.
☆18 Updated 2 weeks ago
Alternatives and similar repositories for dataengineeringutils3
Users who are interested in dataengineeringutils3 compare it to the libraries listed below.
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆225 Updated 9 months ago
- This repository contains the dbt-glue adapter ☆139 Updated 3 weeks ago
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆51 Updated 2 years ago
- Example orchestration pipeline for Fivetran + dbt managed by Airflow ☆22 Updated 4 years ago
- Pytest plugin for dbt core ☆63 Updated last year
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce ☆28 Updated 7 months ago
- pytest plugin to run the tests with support of pyspark ☆88 Updated 8 months ago
- A dbt-core python package that automates the management and creation of dbt groups, contracts, access, and versions. ☆125 Updated last year
- Amazon Managed Workflows for Apache Airflow (MWAA) Examples repository contains example DAGs, requirements.txt, plugins, and CloudFormati… ☆118 Updated 2 months ago
- Enforce Best Practices for all your Airflow DAGs. ⭐ ☆108 Updated last week
- ☆31 Updated 8 months ago
- Pipeline definitions for managing data flows to power analytics at MIT Open Learning ☆45 Updated this week
- A command-line interface for packaging, deploying, and running your EMR Serverless Spark jobs ☆46 Updated last year
- Makes it simple to store test results and visualise them in a BI dashboard ☆51 Updated last month
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆28 Updated 3 years ago
- 🥪🏭 A simple CLI for generating synthetic Jaffle Shop data. ☆45 Updated last month
- Utility functions for dbt projects running on Spark ☆34 Updated last month
- A bunch of hacks developed around dbt ☆48 Updated 6 years ago
- Spark runtime on AWS Lambda ☆113 Updated 5 months ago
- Great Expectations Airflow operator ☆170 Updated this week
- 🏁 A sweet and speedy code generator for dbt 🏎️✨ ☆32 Updated last week
- ☆41 Updated 8 months ago
- A dbt artifacts parser in python ☆110 Updated last week
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆234 Updated 3 months ago
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDb, or PyArrow. ☆27 Updated last year
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 Updated 3 years ago
- This repository has moved into https://github.com/dbt-labs/dbt-adapters ☆106 Updated 11 months ago
- Delta Lake helper methods. No Spark dependency. ☆22 Updated 2 weeks ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆224 Updated last month
- a dbt package to make auditing dbt runs easy. ☆99 Updated last year