judeleonard / Prescriber-ETL-data-pipeline
An end-to-end ETL data pipeline that uses PySpark parallel processing to handle about 25 million rows of data from a SaaS application. Apache Airflow orchestrates the pipeline alongside various data warehouse technologies, and Apache Superset connects to the data warehouse (DWH) to generate BI dashboards for weekly reports.
☆25 · Updated 2 years ago
Alternatives and similar repositories for Prescriber-ETL-data-pipeline
Users that are interested in Prescriber-ETL-data-pipeline are comparing it to the libraries listed below
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆146 · Updated 4 years ago
- Complete end-to-end ETL pipeline with Spark, Airflow, and AWS ☆46 · Updated 5 years ago
- With everything I learned from DEZoomcamp from datatalks.club, this project performs batch processing on AWS for the cycling dataset wh… ☆14 · Updated 3 years ago
- End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker. ☆95 · Updated 2 months ago
- ☆87 · Updated 2 years ago
- ☆40 · Updated 11 months ago
- Glue ETL job or EMR Spark that gets from data catalog, modifies and uploads to S3 and Data Catalog ☆11 · Updated last year
- ☆12 · Updated 4 years ago
- This repository will help you learn Databricks concepts with the help of examples. It will include all the important topics which… ☆98 · Updated 10 months ago
- ☆28 · Updated last year
- ☆132 · Updated 3 months ago
- PySpark Cheatsheet ☆36 · Updated 2 years ago
- Databricks Certified Associate Spark Developer preparation toolkit to set up a single-node standalone Spark cluster along with material in t… ☆30 · Updated last year
- ☆21 · Updated 2 years ago
- Simple ETL pipeline using Python ☆26 · Updated 2 years ago
- Ravi Azure ADB ADF Repository ☆66 · Updated 4 months ago
- Code for "Advanced data transformations in SQL" free live workshop ☆81 · Updated last month
- In this project, we set up end-to-end data engineering using Apache Spark, Azure Databricks, and Data Build Tool (dbt), using Azure as our … ☆30 · Updated last year
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆77 · Updated 11 months ago
- Classwork projects and homework done through the Udacity Data Engineering Nanodegree ☆74 · Updated last year
- ☆64 · Updated last week
- Code for dbt tutorial ☆157 · Updated last year
- End-to-end data engineering project ☆56 · Updated 2 years ago
- Realtime Data Engineering Project ☆30 · Updated 4 months ago
- ☆38 · Updated 2 years ago
- Sample project to demonstrate data engineering best practices ☆191 · Updated last year
- This repo contains "Databricks Certified Data Engineer Professional" questions and related docs. ☆74 · Updated 9 months ago
- Recohut - Learn data engineering, data science ☆97 · Updated last year
- ☆51 · Updated last year
- Projects done in the Data Engineer Nanodegree Program by Udacity.com ☆160 · Updated 2 years ago