Joshua-omolewa / Retailstore_ETL_pipeline_project
Built a data pipeline for a retail store using AWS services. It collects data from the store's transactional database (OLTP) in Snowflake, transforms the raw data (ETL process) with Apache Spark to meet business requirements, and enables data analysts to create data visualizations in Superset. Airflow orchestrates the pipeline.
☆10 Updated last year
Alternatives and similar repositories for Retailstore_ETL_pipeline_project:
Users who are interested in Retailstore_ETL_pipeline_project are comparing it to the repositories listed below:
- In this project, we will build an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS. The pipeline will retrieve data fro… ☆21 Updated last year
- ☆27 Updated last year
- 😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆44 Updated 5 years ago
- ☆19 Updated last year
- For this project I am creating an ETL (Extract, Transform, and Load) pipeline using Python, RegEx, and SQL Database. The goal is to retri… ☆27 Updated 4 years ago
- Projects done in the Data Engineer Nanodegree Program by Udacity.com ☆147 Updated 2 years ago
- Simple ETL pipeline using Python ☆25 Updated last year
- This project focuses on building a robust data pipeline using Apache Airflow to automate the ingestion of weather data from the OpenWeath… ☆21 Updated last year
- Data Engineering Project in GCP ☆18 Updated last year
- With everything I learned from DEZoomcamp from datatalks.club, this project performs batch processing on AWS for the cycling dataset wh… ☆12 Updated 2 years ago
- Classwork projects and homework done through the Udacity data engineering nanodegree ☆9 Updated 3 years ago
- Classwork projects and homework done through the Udacity data engineering nanodegree ☆74 Updated last year
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆35 Updated last year
- Real World Project on Formula1 Racing using Azure Databricks, Delta Lake and Azure Data Factory ☆13 Updated last year
- Repository for Data Engineering Interview Series ☆29 Updated 5 months ago
- Ravi Azure ADB ADF Repository ☆65 Updated last month
- PySpark functions and utilities with examples. Assists ETL process of data modeling ☆100 Updated 4 years ago
- This is an all-in-one repository for Data Engineers, ideal for beginners & interview preparation, which includes Python as the main Progr… ☆26 Updated 2 years ago
- ☆87 Updated 2 years ago
- PySpark Projects ☆23 Updated this week
- An End-to-End ETL data pipeline that leverages PySpark parallel processing to process about 25 million rows of data coming from a SaaS ap… ☆25 Updated 2 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project ☆23 Updated 2 years ago
- Solution to all projects of Udacity's Data Engineering Nanodegree: Data Modeling with Postgres & Cassandra, Data Warehouse with Redshift,… ☆56 Updated 2 years ago
- ☆40 Updated 8 months ago
- ☆23 Updated last year
- ☆49 Updated last year
- Repository related to Spark SQL and PySpark using Python3 ☆37 Updated 2 years ago
- This repository will help you learn Databricks concepts with the help of examples. It will include all the important topics which… ☆96 Updated 7 months ago
- YouTube tutorial project ☆100 Updated last year
- I am using a Confluent Kafka cluster to produce and consume scraped data. In this project, I've created a real-time data pipeline that uti… ☆29 Updated last year