judeleonard / Prescriber-ETL-data-pipeline

An end-to-end ETL data pipeline that uses PySpark parallel processing to process about 25 million rows of data from a SaaS application. Apache Airflow serves as the orchestration tool, the pipeline uses various data warehouse technologies, and Apache Superset connects to the data warehouse (DWH) to generate BI dashboards for weekly reports.
26 · Updated last year
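
The extract, transform, load ordering in the description above can be illustrated with a minimal sketch. This is not the repository's code, and it uses neither PySpark nor Airflow: it relies only on the Python standard library, with `graphlib.TopologicalSorter` standing in for the orchestrator's dependency ordering. The sample rows, field names, task names, and the `run_pipeline` helper are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical sample rows; the described pipeline handles about 25 million.
RAW_ROWS = [
    {"prescriber_id": "1", "state": "ny", "claims": "12"},
    {"prescriber_id": "2", "state": "ca", "claims": ""},
    {"prescriber_id": "3", "state": "ny", "claims": "30"},
]


def extract(ctx):
    # Stand-in for reading the SaaS export.
    ctx["raw"] = list(RAW_ROWS)


def transform(ctx):
    # Drop rows with a missing claim count, then normalize the types.
    ctx["clean"] = [
        {
            "prescriber_id": int(row["prescriber_id"]),
            "state": row["state"].upper(),
            "claims": int(row["claims"]),
        }
        for row in ctx["raw"]
        if row["claims"]
    ]


def load(ctx):
    # Stand-in for a warehouse write: aggregate claims per state.
    totals = {}
    for row in ctx["clean"]:
        totals[row["state"]] = totals.get(row["state"], 0) + row["claims"]
    ctx["warehouse"] = totals


TASKS = {"extract": extract, "transform": transform, "load": load}

# Each task maps to the set of tasks that must finish before it starts.
DEPENDENCIES = {"transform": {"extract"}, "load": {"transform"}}


def run_pipeline():
    ctx = {}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        TASKS[name](ctx)
    return order, ctx["warehouse"]


if __name__ == "__main__":
    order, warehouse = run_pipeline()
    print(order, warehouse)
```

In a real deployment each function would likely become an orchestrated task and the transform step a distributed job; the sketch only shows how declaring dependencies fixes the execution order.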

Related projects

Alternatives and complementary repositories for Prescriber-ETL-data-pipeline