databand-ai / awesome-apache-airflow
Curated list of resources about Apache Airflow
★19 Updated 4 years ago
Alternatives and similar repositories for awesome-apache-airflow
Users who are interested in awesome-apache-airflow are comparing it to the libraries listed below.
- A bunch of hacks developed around dbt ★48 Updated 5 years ago
- Run, schedule, and manage your dbt jobs using Kubernetes. ★24 Updated 6 years ago
- Terraform / NiFi on the Google Cloud Platform ★28 Updated 7 months ago
- Any Airflow project day 1, you can spin up a local desktop Kubernetes Airflow environment AND one in Google Cloud Composer with tested da… ★112 Updated last year
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR ★38 Updated 4 months ago
- Utility functions for dbt projects running on Spark ★34 Updated 4 months ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ★29 Updated 2 years ago
- A CLI to manage and monitor permissions in AWS Lake Formation ★26 Updated 2 years ago
- Pylint plugin for static code analysis on Airflow code ★95 Updated 4 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ★47 Updated 5 months ago
- Sample code to collect Apache Iceberg metrics for table monitoring ★28 Updated 10 months ago
- Automated data quality suggestions and analysis with Deequ on AWS Glue ★85 Updated 2 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ★41 Updated 2 years ago
- ★21 Updated 4 years ago
- Demo for GitHub Universe 2022 ★12 Updated 2 years ago
- CICD pipeline that deploys a dbt image on a GKE cluster ★11 Updated 3 years ago
- Profiles the data, validates the schema and runs data quality checks and produces a report ★20 Updated 6 years ago
- New generation opensource data stack ★68 Updated 3 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ★64 Updated 3 years ago
- Sample configuration to deploy a modern data platform. ★88 Updated 3 years ago
- The open source version of the Amazon Redshift Cluster Management Guide. ★48 Updated 2 years ago
- A terraform module that deploys Dagster to AWS, using ECS. ★36 Updated 2 years ago
- Docker image for AWS Glue Spark/Python ★23 Updated last year
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ★61 Updated 2 years ago
- Building Json data pipeline within Snowflake using Streams and Tasks ★26 Updated 5 years ago
- Faker for Snowflake! ★33 Updated 2 years ago
- Fast iterative local development and testing of Apache Airflow workflows ★201 Updated last month
- Run dbt serverless in the Cloud (AWS) ★42 Updated 5 years ago
- re_data - fix data issues before your users & CEO would discover them ★98 Updated last year
- PySpark data-pipeline testing and CICD ★28 Updated 4 years ago