RealKinetic / aws-glue-pipeline-example
An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.
☆12Updated 4 years ago
Alternatives and similar repositories for aws-glue-pipeline-example
Users who are interested in aws-glue-pipeline-example are comparing it to the repositories listed below.
- A PySpark job to handle upserts, convert to Parquet, and create partitions on S3☆26Updated 4 years ago
- Data Engineering on GCP☆35Updated 2 years ago
- 😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS☆46Updated 5 years ago
- Resources for video demonstrations and blog posts related to DataOps on AWS☆176Updated 3 years ago
- ☆61Updated 3 years ago
- ☆34Updated 2 years ago
- Study materials for the AWS Big Data / Data Analytics Specialty Exam☆27Updated 3 years ago
- Lab Instructions for Data Engineering Immersion Day☆190Updated 3 months ago
- This repository is for demonstrating the capability to do SQL-based UPDATES, DELETES, and INSERTS directly in the Data Lake using Amazon …☆16Updated 3 years ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution☆19Updated 4 years ago
- ☆23Updated 2 years ago
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.☆51Updated last year
- Serverless ETL and Analytics with AWS Glue, published by Packt☆48Updated last year
- code snippet for analytics sessions☆34Updated 3 years ago
- ☆17Updated 5 years ago
- ☆14Updated 6 years ago
- Repository for AWS Glue Workshop☆33Updated 2 years ago
- Snowflake Cookbook, published by Packt☆79Updated 2 years ago
- dbt / Amazon Redshift Demonstration Project☆34Updated 2 years ago
- Airflow helm chart for AWS EKS☆18Updated 4 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats☆29Updated 2 years ago
- ☆87Updated 2 years ago
- Spark app to merge different schemas☆23Updated 4 years ago
- CICD pipeline that deploys a dbt image on a GKE cluster☆11Updated 3 years ago
- Udacity Data Engineer Nano Degree - Project-3 (Data Warehouse)☆22Updated 5 years ago
- Data Engineering with AWS Cookbook, published by Packt☆20Updated 6 months ago
- Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects☆46Updated 6 months ago
- PySpark functions and utilities with examples. Assists ETL process of data modeling☆103Updated 4 years ago
- This project helps me to understand the core concepts of Apache Airflow. I have created custom operators to perform tasks such as staging…☆90Updated 5 years ago
- Example project for consuming AWS Kinesis streaming data and saving it to Amazon Redshift using Apache Spark☆11Updated 7 years ago