RealKinetic / aws-glue-pipeline-example
An example CI/CD pipeline using GitHub Actions for continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.
☆12 · Updated 4 years ago
Related projects
Alternatives and complementary repositories for aws-glue-pipeline-example
- Serverless ETL and Analytics with AWS Glue, published by Packt ☆45 · Updated last year
- Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark ☆11 · Updated 6 years ago
- A PySpark job to handle upserts, convert to Parquet, and create partitions on S3 ☆26 · Updated 4 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS) ☆14 · Updated 5 years ago
- 😈 Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆43 · Updated 5 years ago
- Code snippet for analytics sessions ☆33 · Updated 2 years ago
- Spark data pipeline that processes movie ratings data. ☆27 · Updated last week
- Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects ☆39 · Updated last week
- Git repo to accompany the AWS DevOps Blog: Using AWS DevOps Tools to model and provision AWS Glue workflows ☆19 · Updated 3 years ago
- Data Engineering on GCP ☆30 · Updated 2 years ago
- Udacity Data Engineer Nano Degree - Project-3 (Data Warehouse) ☆22 · Updated 5 years ago
- ☆15 · Updated 3 years ago
- ☆113 · Updated last month
- All the Snowflake Virtual Warehouse - Example ☆11 · Updated 4 years ago
- Repository for AWS Glue Workshop ☆30 · Updated last year
- Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in t… ☆28 · Updated 7 months ago
- Spark app to merge different schemas ☆23 · Updated 3 years ago
- Udacity Data Engineering Nanodegree Capstone Project ☆35 · Updated 4 years ago
- Our first project using Apache Spark. In this project we downloaded the datasets from KAGG… ☆15 · Updated 3 years ago
- My solutions for the Udacity Data Engineering Nanodegree ☆33 · Updated 5 years ago
- Companion repository for the book 'Delta Lake Up and Running' ☆44 · Updated 7 months ago
- Learn how to auto-ingest streaming data into Snowflake using Snowpipe. ☆23 · Updated 2 years ago
- Example repo for creating end-to-end tests for a data pipeline. ☆21 · Updated 5 months ago
- ☆34 · Updated last year
- Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validatio… ☆53 · Updated last year
- ☆49 · Updated 7 months ago
- ☆14 · Updated 5 years ago
- Study materials for the AWS Big Data / Data Analytics Specialty Exam ☆26 · Updated 2 years ago
- Data engineering with dbt, published by Packt ☆61 · Updated 8 months ago