shravan-kuchkula/udacity-data-eng-proj2

shravan-kuchkula / udacity-data-eng-proj2

A production-grade data pipeline that automates the parsing of user search patterns to analyze user engagement. It extracts data from S3, applies a series of transformations, and loads the results into S3 and Redshift.

☆24

Alternatives and similar repositories for udacity-data-eng-proj2

Users who are interested in udacity-data-eng-proj2 are comparing it to the repositories listed below.

shravan-kuchkula / dataEngineering
View on GitHub
A repo to track data engineering projects
☆14 · Nov 11, 2022 · Updated 3 years ago
shravan-kuchkula / udacity-data-eng-proj-1
View on GitHub
Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,…
☆89 · Nov 22, 2021 · Updated 4 years ago
dushyantkhosla / airflow4ds
View on GitHub
Using Apache Airflow to author, run and monitor complex data pipelines.
☆12 · Oct 24, 2018 · Updated 7 years ago
MaxiDS / -Streaming_ETL_FastApi_Docker
View on GitHub
I developed an ETL project over files from different sources (CSV, JSON). Then I used FastAPI to create an API that allows re…
☆10 · Dec 9, 2022 · Updated 3 years ago
Aaron-K-T-Berry / airflow-docker-boilerplate
View on GitHub
☆11 · Updated this week
vsouza / spark-kinesis-redshift
View on GitHub
Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark
☆11 · May 22, 2018 · Updated 8 years ago
dgadiraju / nifi-workshop
View on GitHub
☆26 · Aug 25, 2020 · Updated 5 years ago
anthonywong611 / Batch-ETL-with-AWS-EMR-and-MWAA
View on GitHub
Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed airflow: extrac…
☆10 · Jul 12, 2021 · Updated 5 years ago
codingvarun / streaming-elt-pipeline
View on GitHub
This is a real-life, high throughput streaming ELT data pipeline for ecommerce
☆15 · May 22, 2023 · Updated 3 years ago
johnny-chivers / emrZeroToHero
View on GitHub
Repo which holds the materials for the EMR Zero To Hero
☆28 · May 7, 2022 · Updated 4 years ago
vighc / kafka-stream
View on GitHub
Deployed a Kafka instance on an AWS EC2 instance to stream data into Databricks
☆10 · Aug 15, 2023 · Updated 2 years ago
Narendra-Kamath / webxr-measuring-tape
View on GitHub
⚡ An Augmented Reality real-world length measuring web application built by the modification of the example being provided by babylonjs -…
☆12 · Sep 24, 2020 · Updated 5 years ago
094459 / blogpost-airflow-hybrid
View on GitHub
Repo that will help you explore how to build a hybrid workflow using Apache Airflow and Amazon ECS Anywhere
☆11 · Jul 12, 2022 · Updated 4 years ago
aaronstone007 / Udacity-Data-Streaming
View on GitHub
Projects from Udacity Data Streaming Nanodegree
☆15 · Aug 14, 2023 · Updated 2 years ago
fhinkel / git-bisect-demo
View on GitHub
Practice git-bisect.
☆10 · Oct 31, 2019 · Updated 6 years ago
chuqiaoshen / Git-Influencer
View on GitHub
Insight Data Engineering project: A platform built in HDFS, Spark and Airflow to help you to find social influencers from GitHub Net…
☆16 · May 21, 2024 · Updated 2 years ago
kevindegila / TakeABreak
View on GitHub
A simple Python script which reminds you to take a break from your screen
☆11 · Sep 5, 2020 · Updated 5 years ago
ajupton / big-data-engineering-project
View on GitHub
Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR
☆92 · Jul 17, 2019 · Updated 7 years ago
salman3-029 / kafka-azure-data-engineering-project
View on GitHub
☆13 · Dec 30, 2022 · Updated 3 years ago
sungchun12 / serverless-data-pipeline-gcp
View on GitHub
Schedule a data pipeline in Google Cloud using cloud function, BigQuery, cloud storage, cloud scheduler, stack trace, cloud build, and p…
☆25 · Jun 4, 2019 · Updated 7 years ago
FBosler / you-datascientist
View on GitHub
☆15 · Sep 20, 2019 · Updated 6 years ago
liquidtelecom / Golang-Training-Examples
View on GitHub
This repo contains example code used for golang training
☆10 · Feb 19, 2023 · Updated 3 years ago
InsightDataScience / anomaly_detection
View on GitHub
☆18 · Jul 2, 2017 · Updated 9 years ago
4OH4 / pytesting
View on GitHub
Demo code for testing with Pytest and Hypothesis
☆14 · Oct 12, 2021 · Updated 4 years ago
CityOfPhiladelphia / phila-airflow
View on GitHub
☆15 · May 31, 2017 · Updated 9 years ago
cpe200-160 / CalculatorLab
View on GitHub
Computer Engineering Lab for 261200
☆14 · May 2, 2021 · Updated 5 years ago
masfworld / cdc_deltaLake
View on GitHub
Docker compose and Google Colab demo to build a CDC with Delta Lake
☆15 · Sep 7, 2022 · Updated 3 years ago
aoelvp94 / Spotydash
View on GitHub
Toy project showing how to set up a data science project
☆11 · Nov 24, 2022 · Updated 3 years ago
sahibpreetsingh12 / 100daysofmlcode
View on GitHub
☆11 · May 22, 2021 · Updated 5 years ago
jess197 / football_statistics_etl_project
View on GitHub
☆13 · Dec 28, 2023 · Updated 2 years ago
noahgift / container-revolution-devops-microservices
View on GitHub
DevOps SKlearn Microservice
☆27 · Jun 21, 2022 · Updated 4 years ago
pybraries / pybraries
View on GitHub
Python wrapper for libraries.io API
☆18 · Dec 1, 2024 · Updated last year
manceps / fashion-mnist-kfp-lab
View on GitHub
A notebook showing how to easily convert a current notebook you have to a notebook that can be run on Kubeflow Pipelines.
☆15 · Jul 15, 2020 · Updated 6 years ago
alvintoh / udemy-hands-on-hadoop
View on GitHub
AlvinToh Learning Repository for The Ultimate Hands-On Hadoop - Tame your Big Data!
☆10 · May 23, 2018 · Updated 8 years ago
weijinqian0 / feature_eda
View on GitHub
A collection of all the methods used in feature engineering, organized for reuse
☆11 · Jan 11, 2021 · Updated 5 years ago
supratim94336 / DataEngineeringCapstoneProject
View on GitHub
😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS
☆51 · Aug 23, 2019 · Updated 6 years ago
jason-jz-zhu / Modularization_Python_Docker_Demo
View on GitHub
The Demo for Blog: Modularization using Python and Docker (MicroService)
☆11 · Feb 4, 2021 · Updated 5 years ago
aws-samples / amazon-redshift-streaming-workshop
View on GitHub
This repository provides the resources required for the Amazon Redshift Streaming workshop
☆13 · Apr 13, 2026 · Updated 3 months ago
macloujulian / kubernetescurso
View on GitHub
☆17 · Jun 27, 2020 · Updated 6 years ago