marclamberti / airflow-eks-helm-chart
Airflow helm chart for AWS EKS
☆18 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for airflow-eks-helm-chart
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆170 · Updated 2 years ago
- Materials for the next course ☆22 · Updated last year
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆39 · Updated 3 years ago
- ☆34 · Updated last year
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆89 · Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 · Updated last year
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 · Updated 11 months ago
- Airflow training for the crunch conf ☆105 · Updated 6 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 · Updated 2 years ago
- Pyspark boilerplate for running prod ready data pipeline ☆28 · Updated 3 years ago
- Spark ETL example processing New York taxi rides public dataset on EKS ☆44 · Updated last year
- Spark data pipeline that processes movie ratings data. ☆27 · Updated this week
- Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce. ☆38 · Updated 2 years ago
- Materials for the course The Complete Hands-On Introduction to Apache Airflow ☆29 · Updated last year
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work ☆47 · Updated 2 years ago
- This repository is for demonstrating the capability to do SQL-based UPDATES, DELETES, and INSERTS directly in the Data Lake using Amazon … ☆16 · Updated 3 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆25 · Updated last year
- Study materials for the AWS Big Data / Data Analytics Specialty Exam ☆26 · Updated 2 years ago
- Demo for GitHub Universe 2022 ☆12 · Updated last year
- Spark runtime on AWS Lambda ☆94 · Updated 2 months ago
- (project & tutorial) dag pipeline tests + ci/cd setup ☆85 · Updated 3 years ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆72 · Updated last year
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 · Updated last year
- Repository used for Spark Trainings ☆53 · Updated last year
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow ☆32 · Updated 4 years ago
- Example code for running Spark and Hive jobs on EMR Serverless. ☆153 · Updated last week
- ☆14 · Updated 5 years ago
- A repository of sample code to accompany our blog post on Airflow and dbt. ☆167 · Updated last year
- Data lake, data warehouse on GCP ☆54 · Updated 2 years ago