vsouza / spark-kinesis-redshiftLinks

Example project for consuming AWS Kinesis streamming and save data on Amazon Redshift using Apache Spark

☆11

Alternatives and similar repositories for spark-kinesis-redshift

Users that are interested in spark-kinesis-redshift are comparing it to the libraries listed below

Sorting:

ShafiqaIqbal / AWS-Glue-Pyspark-ETL-Job
A Pyspark job to handle upserts, conversion to parquet and create partitions on S3
☆26Updated 4 years ago
mehd-io / pyspark-boilerplate-mehdio
Pyspark boilerplate for running prod ready data pipeline
☆28Updated 4 years ago
shravan-kuchkula / udacity-data-eng-proj4
Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as …
☆16Updated 5 years ago
mikaelahonen-solita / aws-glue-tutorial
AWS Glue tutorial for data developers.
☆23Updated 5 years ago
vim89 / datapipelines-essentials-python
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…
☆55Updated 2 years ago
BenSchr / Udacity-Data-Engineering-Projects
My solutions for the Udacity Data Engineering Nanodegree
☆34Updated 5 years ago
jamesweakley / snowflake-rbgm
Rules based grant management for Snowflake
☆40Updated 6 years ago
RealKinetic / aws-glue-pipeline-example
An example CI/CD pipeline using GitHub Actions for doing continuous deployment of AWS Glue jobs built on PySpark and Jupyter Notebooks.
☆12Updated 4 years ago
aws-samples / redshift-etl-automation-with-dbt
☆34Updated 2 years ago
garystafford / aws-airflow-demo
Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A…
☆41Updated 2 years ago
sungchun12 / dbt_bigquery_example
How to Automate SQL: dbt(data build tool) tutorial on bigquery with extensive NOTES
☆32Updated last year
akashmehta10 / profiling_pyspark
☆26Updated last year
shravan-kuchkula / dataEngineering
A repo to track data engineering projects
☆13Updated 2 years ago
shravan-kuchkula / udacity-data-eng-proj-1
Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,…
☆90Updated 3 years ago
shravan-kuchkula / udacity-data-eng-proj2
A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d…
☆24Updated 3 years ago
arverma / TowardsDataEngineering
This repo contains commands that data engineers use in day to day work.
☆61Updated 2 years ago
garystafford / tickit-data-lake-demo
Resources for video demonstrations and blog posts related to DataOps on AWS
☆178Updated 3 years ago
PacktPublishing / Snowflake-Cookbook
Snowflake Cookbook, published by Packt
☆80Updated 2 years ago
rvilla87 / ETL-PySpark
ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS)
☆17Updated 6 years ago
Snowflake-Labs / sfgrantreport
Snowflake Grant Report offers a way of visualizing role hierarchy and rapid diagnosis of as-is permissions, giving customers insight with…
☆75Updated 2 years ago
yennanliu / spark-etl-pipeline
Various data stream/batch process demo with Apache Scala Spark 🚀
☆11Updated 5 years ago
soyelherein / pyspark-cicd-template
PySpark data-pipeline testing and CICD
☆28Updated 4 years ago
aws-samples / amazon-mwaa-examples
Amazon Managed Workflows for Apache Airflow (MWAA) Examples repository contains example DAGs, requirements.txt, plugins, and CloudFormati…
☆116Updated 7 months ago
dacort / ci-cd-serverless-spark
Demo for GitHub Universe 2022
☆12Updated 2 years ago
klescosia / aws-glue-delta-lake
This repository is for demonstrating the capability to do SQL-based UPDATES, DELETES, and INSERTS directly in the Data Lake using Amazon …
☆16Updated 3 years ago
idealo / terraform-emr-pyspark
Quickstart PySpark with Anaconda on AWS/EMR using Terraform
☆47Updated 5 months ago
jaehyeon-kim / dbt-on-aws
dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats
☆29Updated 2 years ago
marclamberti / airflow-materials-aws
Materials for the next course
☆24Updated 2 years ago
Modingwa / Data-Engineering-Capstone-Project
Udacity Data Engineering Nanodegree Capstone Project
☆36Updated 5 years ago
aws-samples / emr-studio-notebook-examples
This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.
☆51Updated last year