gtoonstra / etl-with-airflow
ETL best practices with Airflow, with examples
☆1,295 · Updated last month
Related projects
Alternatives and complementary repositories for etl-with-airflow
- Guides and docs to help you get up and running with Apache Airflow. ☆800 · Updated 2 years ago
- Dynamically generate Apache Airflow DAGs from YAML configuration files ☆1,209 · Updated this week
- A series of DAGs/Workflows to help maintain the operation of Airflow ☆1,681 · Updated 5 months ago
- Curated list of resources about Apache Airflow ☆3,691 · Updated 3 months ago
- A curated list of data engineering tools for software developers ☆462 · Updated 7 years ago
- Docker Apache Airflow ☆3,776 · Updated last year
- Airflow basics tutorial ☆397 · Updated 3 years ago
- A docker image and kubernetes config files to run Airflow on Kubernetes ☆655 · Updated 5 years ago
- Apache Airflow tutorial ☆934 · Updated 2 years ago
- Example DAGs using hooks and operators from Airflow Plugins ☆333 · Updated 6 years ago
- A curated list of awesome ETL frameworks, libraries, and software. ☆3,287 · Updated 3 months ago
- Helm Charts for the Astronomer Platform, Apache Airflow as a Service on Kubernetes ☆465 · Updated this week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆1,913 · Updated this week
- Apache Airflow integration for dbt ☆396 · Updated 6 months ago
- A plugin for Apache Airflow that exposes rest end points for the Command Line Interfaces ☆325 · Updated 3 years ago
- A boilerplate for writing PySpark Jobs ☆393 · Updated 10 months ago
- A plugin for Apache Airflow that allows you to edit DAGs in browser ☆403 · Updated 2 weeks ago
- Python API for Deequ ☆730 · Updated last month
- A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow ☆2,081 · Updated 11 months ago
- Port(ish) of Great Expectations to dbt test macros ☆1,083 · Updated 2 months ago
- A Helm chart to install Apache Airflow on Kubernetes ☆276 · Updated this week
- Airflow Unit Tests and Integration Tests ☆256 · Updated 2 years ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,552 · Updated 6 months ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,309 · Updated last month
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR ☆173 · Updated last year
- Assets related to the operation of Fishtown Analytics. ☆415 · Updated last month
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆643 · Updated last month
- Apache Airflow in Docker Compose (for both versions 1.10.* and 2.*) ☆184 · Updated 11 months ago
- Utility functions for dbt projects. ☆1,379 · Updated last week
- Collect, aggregate, and visualize a data ecosystem's metadata ☆1,781 · Updated last week