r-kells / prod-airflow
Helping you get Airflow running in production.
☆9 Updated 5 years ago
Alternatives and similar repositories for prod-airflow
Users that are interested in prod-airflow are comparing it to the libraries listed below
- Udacity Data Pipeline Exercises ☆15 Updated 5 years ago
- This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS … ☆19 Updated 3 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- ☆12 Updated 3 years ago
- ☆18 Updated 3 years ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 Updated 9 years ago
- PySpark phonetic and string matching algorithms ☆39 Updated last year
- Docker compose and Google Colab demo to build a CDC with Delta Lake ☆15 Updated 2 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆55 Updated 2 years ago
- Just a boilerplate for PySpark and Flask ☆35 Updated 6 years ago
- event-triggered plugins for airflow ☆21 Updated 5 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated 4 months ago
- Repo for all my code on the articles I post on medium ☆107 Updated 2 years ago
- Learn to build a data pipeline with Airflow to automate wrangling data - An Udacity Data Engineer Nano Degree Project ☆8 Updated 5 years ago
- ☆26 Updated last year
- Build end-to-end Machine Learning pipeline to predict accessibility of playgrounds in NYC ☆15 Updated 4 years ago
- Examples for using Amazon SageMaker components in Kubeflow Pipelines ☆22 Updated 5 years ago
- An example PySpark project with pytest ☆16 Updated 7 years ago
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work ☆47 Updated 2 years ago
- Delux Airflow deployment with Minikube ☆10 Updated 4 years ago
- Basic tutorial of using Apache Airflow ☆36 Updated 6 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆16 Updated 6 years ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE ☆25 Updated 4 years ago
- This repo demonstrates how to load a sample Parquet formatted file from an AWS S3 Bucket. A python job will then be submitted to a Apach… ☆19 Updated 8 years ago
- Big Data Demystified meetup and blog examples ☆31 Updated 9 months ago
- ☆16 Updated 7 years ago
- Airflow helm chart for AWS EKS ☆18 Updated 4 years ago
- ☆20 Updated 3 years ago
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc ☆51 Updated 8 years ago