ThiagoPanini / sparksnake
Improving the development of Spark applications deployed as jobs on AWS services like Glue and EMR
☆12Updated last year
Related projects:
- ☆15Updated 5 months ago
- ☆14Updated 8 months ago
- app-server-migration helps in discovering the changes required to migrate the code from source server to target server and provides effor…☆15Updated 6 months ago
- ☆22Updated last year
- ☆26Updated 6 months ago
- Class content from cohort 6 of the How data engineering bootcamp☆12Updated 3 years ago
- It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged fo…☆11Updated 3 years ago
- This repository contains example patterns for storing large objects with DynamoDB.☆11Updated 3 months ago
- ☆23Updated 2 years ago
- Spark environment for Glue development☆9Updated 3 years ago
- A PySpark job to handle upserts, convert data to Parquet, and create partitions on S3☆26Updated 4 years ago
- An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.☆12Updated 3 years ago
- 🐋 Docker image for AWS Glue Spark/Python☆22Updated last year
- Deployment of Airflow 2.0 using ECS Fargate and AWS CDK.☆14Updated 2 years ago
- Data Engineering with Apache Spark☆43Updated 3 years ago
- This is an ETL application on AWS with general open sales and customer data that you can find here: https://github.com/camposvinicius/dat…☆17Updated 2 years ago
- Script for ingesting data from Mercado Bitcoin☆11Updated last year
- This Guidance helps customers set up an ecommerce website on WordPress.☆10Updated last year
- Git repo to accompany the AWS DevOps Blog: Using AWS DevOps Tools to model and provision AWS Glue workflows☆17Updated 2 years ago
- Build, Test and Deploy ETL solutions using AWS Glue and AWS CDK based CI/CD pipelines☆36Updated last year
- ☆22Updated 4 years ago
- This repo provides the Kubernetes Helm chart for deploying Pyspark Notebook.☆17Updated last year
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.☆44Updated 10 months ago
- The open source version of the Amazon Redshift Getting Started Guide.☆15Updated last year
- ☆15Updated last year
- ☆10Updated 7 months ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS)☆13Updated 5 years ago
- Lambda function that automatically creates or updates AWS resources with AWS services' IP ranges from the ip-ranges.json file. You can confi…☆14Updated 2 months ago
- Terraform module to create AWS EMR resources 🇺🇦☆23Updated last month
- This repository shows how to set up Centralized CloudWatch Observability Manager using Terraform☆15Updated 6 months ago