RealKinetic / aws-glue-pipeline-example
An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.
☆12 Updated 4 years ago
Alternatives and similar repositories for aws-glue-pipeline-example:
Users who are interested in aws-glue-pipeline-example are comparing it to the repositories listed below:
- Git repo to accompany the AWS DevOps Blog: Using AWS DevOps Tools to model and provision AWS Glue workflows ☆20 Updated 3 years ago
- Serverless ETL and Analytics with AWS Glue, published by Packt ☆48 Updated last year
- Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆45 Updated 5 years ago
- code snippet for analytics sessions ☆34 Updated 2 years ago
- Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in t… ☆30 Updated last year
- Data Engineering on GCP ☆35 Updated 2 years ago
- A Pyspark job to handle upserts, conversion to parquet and create partitions on S3 ☆26 Updated 4 years ago
- ☆14 Updated 6 years ago
- DAS-C01 ACG/LA by Brock Tubre and John Hanna ☆126 Updated last year
- Snowflake Cookbook, published by Packt ☆79 Updated 2 years ago
- A repo to track data engineering projects ☆13 Updated 2 years ago
- Study materials for the AWS Big Data / Data Analytics Specialty Exam ☆27 Updated 3 years ago
- Repository for AWS Glue Workshop ☆31 Updated 2 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS) ☆15 Updated 6 years ago
- ☆17 Updated 4 years ago
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆38 Updated 9 months ago
- All the Snowflake Virtual Warehouse - Example ☆12 Updated 4 years ago
- PySpark Cheatsheet ☆36 Updated 2 years ago
- The goal of this project is to analyse the impact of Covid-19 on the Aviation industry through data engineering processes using technolog… ☆12 Updated 2 years ago
- This is the first project where we worked on Apache Spark. In this project we downloaded the datasets from KAGG… ☆18 Updated 3 years ago
- This repository is for demonstrating the capability to do SQL-based UPDATES, DELETES, and INSERTS directly in the Data Lake using Amazon … ☆16 Updated 3 years ago
- Data engineering with dbt, published by Packt ☆77 Updated last year
- Udacity Data Engineer Nano Degree - Project-3 (Data Warehouse) ☆22 Updated 5 years ago
- Simple ETL pipeline using Python ☆26 Updated last year
- Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects ☆45 Updated 4 months ago
- Example project for consuming AWS Kinesis streaming data and saving it to Amazon Redshift using Apache Spark ☆11 Updated 6 years ago
- Lab Instructions for Data Engineering Immersion Day ☆189 Updated 2 months ago
- ☆87 Updated 2 years ago
- Spark data pipeline that processes movie ratings data. ☆28 Updated 3 weeks ago
- ☆128 Updated 2 months ago