minhky2185 / healthcare_data_pipeline
An end-to-end data pipeline for building a Data Lake and supporting reports using Apache Spark.
☆16 Updated 2 years ago
Alternatives and similar repositories for healthcare_data_pipeline
Users that are interested in healthcare_data_pipeline are comparing it to the libraries listed below
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc… ☆23 Updated 3 years ago
- This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh… ☆201 Updated 2 years ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆45 Updated 2 years ago
- YouTube tutorial project ☆107 Updated 2 years ago
- ☆45 Updated last year
- ☆70 Updated last week
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB. ☆43 Updated 2 years ago
- Simple ETL pipeline using Python ☆29 Updated 2 years ago
- The goal of this project is to track the expenses of Uber Rides and Uber Eats through data Engineering processes using technologies such … ☆123 Updated 3 years ago
- Resources and projects from Udacity Data Engineering with AWS nano degree programme ☆28 Updated 2 years ago
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra ☆144 Updated 2 years ago
- ☆48 Updated last year
- ☆212 Updated 2 years ago
- End to end data engineering project with kafka, airflow, spark, postgres and docker. ☆107 Updated 2 weeks ago
- This repo contains all code and data for WWCode Python DE workshop Aug 18 and 25 2022 ☆25 Updated 3 years ago
- ☆147 Updated 2 years ago
- This repo contains my projects from the Udacity Data Engineering Nano degree ☆13 Updated 2 years ago
- ☆317 Updated last year
- Price Crawler - Tracking Price Inflation ☆189 Updated 5 years ago
- Data Engineering and Data Analysis, as a new hire being tested by the boss, using SQL databases. ☆15 Updated 2 years ago
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆160 Updated 5 years ago
- This is the first project where we worked on apache spark, In this project what we have done is that we downloaded the datasets from KAGG… ☆22 Updated 4 years ago
- This repo contains all the code used in the Python for Data Engineering Course ☆332 Updated last year
- An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlD… ☆16 Updated 2 years ago
- My current data engineering portfolio. Includes projects spanning ETL, orchestration and dashboarding. ☆118 Updated last year
- ☆21 Updated 2 years ago
- ☆88 Updated 3 years ago
- Data Engineering YouTube Analysis Project by Darshil Parmar ☆219 Updated 2 years ago
- PySpark functions and utilities with examples. Assists ETL process of data modeling ☆104 Updated 5 years ago
- ☆89 Updated last year