RealKinetic / aws-glue-pipeline-example
An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.
☆12 Updated 4 years ago
Alternatives and similar repositories for aws-glue-pipeline-example:
Users who are interested in aws-glue-pipeline-example are comparing it to the repositories listed below
- A PySpark job to handle upserts, convert to Parquet, and create partitions on S3 ☆26 Updated 4 years ago
- Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark ☆11 Updated 6 years ago
- Snowflake Cookbook, published by Packt ☆77 Updated 2 years ago
- Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in t… ☆29 Updated 10 months ago
- ☆87 Updated 2 years ago
- AWS Glue tutorial for data developers. ☆23 Updated 5 years ago
- code snippet for analytics sessions ☆33 Updated 2 years ago
- 😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆43 Updated 5 years ago
- Azure Data Engineering Cookbook 2nd-edition, published by Packt ☆31 Updated last year
- All the Snowflake Virtual Warehouse - Example ☆11 Updated 4 years ago
- ☆34 Updated 2 years ago
- Repository for AWS Glue Workshop ☆31 Updated 2 years ago
- Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects ☆42 Updated 3 months ago
- Udacity Data Engineer Nano Degree - Project-3 (Data Warehouse) ☆22 Updated 5 years ago
- Learn how to auto-ingest streaming data into Snowflake using Snowpipe. ☆23 Updated 2 years ago
- Serverless ETL and Analytics with AWS Glue, published by Packt ☆46 Updated last year
- Data Engineering with Spark and Delta Lake ☆95 Updated 2 years ago
- Study materials for the AWS Big Data / Data Analytics Specialty Exam ☆27 Updated 2 years ago
- ☆14 Updated 5 years ago
- Simplifying Data Engineering and Analytics with Delta, published by Packt ☆21 Updated last year
- Udacity Data Engineering Nanodegree Capstone Project ☆35 Updated 4 years ago
- Data Engineering with AWS Cookbook, published by Packt ☆14 Updated 3 months ago
- ☆17 Updated 4 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆40 Updated 2 years ago
- ☆15 Updated 3 years ago
- Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and mo… ☆25 Updated 3 years ago
- Lab Instructions for Data Engineering Immersion Day ☆186 Updated 2 weeks ago
- Quickstart: Getting Started with Snowpark Python ☆31 Updated 2 years ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 Updated 2 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS) ☆15 Updated 6 years ago