dsaidgovsg / airflow-pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
☆173 · Updated last year
Related projects
Alternatives and complementary repositories for airflow-pipeline
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆75 · Updated 5 years ago
- ☆196 · Updated last year
- Example DAGs using hooks and operators from Airflow Plugins ☆333 · Updated 6 years ago
- Use Airflow to move data from multiple MySQL databases to BigQuery ☆99 · Updated 4 years ago
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 · Updated 3 years ago
- A plugin for Apache Airflow that exposes rest end points for the Command Line Interfaces ☆325 · Updated 3 years ago
- A process that runs in unison with Apache Airflow to control the Scheduler process to ensure High Availability ☆232 · Updated 2 years ago
- A boilerplate for writing PySpark Jobs ☆393 · Updated 10 months ago
- Airflow basics tutorial ☆397 · Updated 3 years ago
- Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform ☆262 · Updated last year
- A guide to running Airflow on Kubernetes ☆172 · Updated 5 years ago
- Spark style guide ☆256 · Updated last month
- Base Docker image with just essentials: Hadoop, Hive and Spark. ☆67 · Updated 3 years ago
- Fast iterative local development and testing of Apache Airflow workflows ☆193 · Updated 5 months ago
- Airflow workflow management platform chef cookbook. ☆68 · Updated 5 years ago
- Spark on Kubernetes infrastructure Helm charts repo ☆199 · Updated 2 years ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆96 · Updated last year
- Airflow Backfill UI based plugin for existing / new Airflow environment ☆66 · Updated 3 years ago
- CSD for Apache Airflow ☆20 · Updated 5 years ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆76 · Updated 6 years ago
- Ambari stack service for installing and managing Apache Airflow on HDP cluster ☆59 · Updated 6 years ago
- Multiple node presto cluster on docker container ☆121 · Updated 2 years ago
- Airflow training for the crunch conf ☆105 · Updated 6 years ago
- Astronomer Core Docker Images ☆106 · Updated 5 months ago
- Build configuration-driven ETL pipelines on Apache Spark ☆158 · Updated 2 years ago
- Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks ☆359 · Updated 7 years ago
- Lightweight proxy to expose the UI of an Apache Spark cluster that is behind a firewall ☆100 · Updated 4 years ago
- ☆245 · Updated 5 years ago
- Apache (Py)Spark type annotations (stub files). ☆115 · Updated 2 years ago
- Create HTML profiling reports from Apache Spark DataFrames ☆195 · Updated 4 years ago