puckel / docker-airflow
Docker Apache Airflow
☆3,775 · Updated last year
Related projects
Alternatives and complementary repositories for docker-airflow
- A series of DAGs/Workflows to help maintain the operation of Airflow ☆1,680 · Updated 4 months ago
- Curated list of resources about Apache Airflow ☆3,681 · Updated 2 months ago
- ETL best practices with airflow, with examples ☆1,293 · Updated last month
- Dynamically generate Apache Airflow DAGs from YAML configuration files ☆1,197 · Updated this week
- Guides and docs to help you get up and running with Apache Airflow. ☆799 · Updated 2 years ago
- A docker image and kubernetes config files to run Airflow on Kubernetes ☆654 · Updated 5 years ago
- Apache Airflow tutorial ☆933 · Updated 2 years ago
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,428 · Updated last week
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,307 · Updated last month
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆37,011 · Updated this week
- Airflow basics tutorial ☆398 · Updated 3 years ago
- A curated list of awesome ETL frameworks, libraries, and software. ☆3,279 · Updated 3 months ago
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆1,901 · Updated this week
- Python interface to Hive and Presto. 🐝 ☆1,670 · Updated 3 months ago
- Jupyter magics and kernels for working with remote Spark clusters ☆1,328 · Updated last week
- Always know what to expect from your data. ☆9,970 · Updated this week
- A plugin for Apache Airflow that exposes rest end points for the Command Line Interfaces ☆325 · Updated 3 years ago
- A curated list of data engineering tools for software developers ☆461 · Updated 7 years ago
- Apache Spark docker image ☆2,038 · Updated last year
- Code for Data Pipelines with Apache Airflow ☆716 · Updated 2 months ago
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆7,578 · Updated this week
- pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoD… ☆3,926 · Updated this week
- Helm Charts for the Astronomer Platform, Apache Airflow as a Service on Kubernetes ☆465 · Updated this week
- Amazon Redshift Utils contains utilities, scripts and view which are useful in a Redshift environment ☆2,765 · Updated 3 months ago
- A plugin for Apache Airflow that allows you to edit DAGs in browser ☆401 · Updated this week
- A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow ☆2,079 · Updated 10 months ago
- Collect, aggregate, and visualize a data ecosystem's metadata ☆1,773 · Updated this week
- An orchestration platform for the development, production, and observation of data assets. ☆11,649 · Updated this week