jksinghpro / docker-airflow
Docker setup for Airflow with MySQL as the backend
☆13 Updated 6 years ago
Alternatives and similar repositories for docker-airflow:
Users who are interested in docker-airflow are comparing it to the libraries listed below.
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- Materials of the Official Helm Chart Webinar ☆27 Updated 3 years ago
- Airflow helm chart for AWS EKS ☆18 Updated 4 years ago
- Materials for the next course ☆24 Updated last year
- Rules based grant management for Snowflake ☆40 Updated 5 years ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆74 Updated last year
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆166 Updated last year
- Sample Airflow DAGs ☆61 Updated 2 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆28 Updated last year
- Delta Lake Documentation ☆48 Updated 7 months ago
- Execution of DBT models using Apache Airflow through Docker Compose ☆113 Updated 2 years ago
- Airflow training for the crunch conf ☆104 Updated 6 years ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 Updated 2 years ago
- ☆19 Updated 3 years ago
- ☆20 Updated 5 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 2 years ago
- Bare minimal Airflow on Kubernetes (Local, EKS, AKS) ☆52 Updated 4 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆106 Updated this week
- Spark on Kubernetes using Helm ☆34 Updated 4 years ago
- Spark data pipeline that processes movie ratings data. ☆27 Updated this week
- CICD pipeline that deploys a dbt image on a GKE cluster ☆11 Updated 3 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 Updated 2 years ago
- A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for … ☆132 Updated 4 years ago
- PySpark data-pipeline testing and CICD ☆28 Updated 4 years ago
- ☆25 Updated last year
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆173 Updated 2 years ago