apache / airflow-ci
Apache Airflow CI pipeline
☆18 Updated 5 years ago
Related projects
Alternatives and complementary repositories for airflow-ci
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 Updated 4 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆56 Updated last year
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago
- An example PySpark project with pytest ☆17 Updated 7 years ago
- ☆24 Updated 4 years ago
- Skeleton project for Apache Airflow training participants to work on. ☆16 Updated 4 years ago
- ☆13 Updated last week
- event-triggered plugins for airflow ☆21 Updated 4 years ago
- Provide functionality to build statistical models to repair dirty tabular data in Spark ☆12 Updated last year
- Quickly get a kubernetes executor airflow environment provisioned on GKE. Azure Kubernetes Service instructions included also as are inst… ☆36 Updated 4 years ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆36 Updated 6 years ago
- Export Airflow metrics (from mysql) in prometheus format ☆29 Updated 2 years ago
- Amundsen Gremlin ☆20 Updated 2 years ago
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆75 Updated 5 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 7 years ago
- Cloud Spanner Connector for Apache Spark ☆17 Updated last month
- Pylint plugin for static code analysis on Airflow code ☆90 Updated 4 years ago
- Visualize dependencies between Airflow DAGs ☆49 Updated 3 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 Updated 8 months ago
- Oozie Workflow to Airflow DAGs migration tool ☆87 Updated 2 weeks ago
- Paper: A Zero-rename committer for object stores ☆20 Updated 3 years ago
- Dione - a Spark and HDFS indexing library ☆50 Updated 7 months ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago
- hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different. ☆28 Updated 6 years ago
- type-class based data cleansing library for Apache Spark SQL ☆79 Updated 5 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆69 Updated last year
- Composable filesystem hooks and operators for Apache Airflow. ☆17 Updated 3 years ago