Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary.
☆29 · Aug 14, 2023 · Updated 2 years ago
Alternatives and similar repositories for udacity-data-eng-proj3
Users that are interested in udacity-data-eng-proj3 are comparing it to the libraries listed below.
- A repo to track data engineering projects ☆13 · Nov 11, 2022 · Updated 3 years ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d… ☆24 · Nov 22, 2021 · Updated 4 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as … ☆17 · Oct 1, 2019 · Updated 6 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom Airflow operators that handle the extraction, transformation,… ☆89 · Nov 22, 2021 · Updated 4 years ago
- A Kafka aggregator based on the Faust Python Stream Processing library ☆10 · Apr 10, 2023 · Updated 3 years ago
- Python REST client for interacting with the Confluent Schema Registry server ☆179 · Nov 24, 2025 · Updated 5 months ago
- End-to-end ELT data engineering project ☆23 · Dec 24, 2022 · Updated 3 years ago
- This repo demonstrates how to load a sample Parquet-formatted file from an AWS S3 bucket. A Python job will then be submitted to an Apach… ☆19 · Jun 23, 2016 · Updated 9 years ago
- NoSQL extract, transform, load (ETL) toolkit with Python ☆16 · Apr 26, 2026 · Updated last week
- A Cookiecutter template for creating Faust projects quickly. ☆70 · Dec 1, 2022 · Updated 3 years ago
- Stock Market Data Fetching Project (US/China) ☆10 · Mar 10, 2021 · Updated 5 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project ☆24 · Jul 14, 2022 · Updated 3 years ago
- Example project for consuming AWS Kinesis streaming data and saving it to Amazon Redshift using Apache Spark ☆11 · May 22, 2018 · Updated 7 years ago
- ☆15 · Aug 18, 2021 · Updated 4 years ago
- Simple ETL pipeline using Python ☆29 · May 22, 2023 · Updated 2 years ago
- An end-to-end data pipeline extracting music listening habits and producing an insightful dashboard ☆17 · Mar 31, 2024 · Updated 2 years ago
- A real-time event pipeline around Kafka Ecosystem for Chicago Transit Authority. ☆32 · Aug 14, 2023 · Updated 2 years ago
- Some example projects for Data Engineers to build, end-to-end. ☆39 · Nov 8, 2023 · Updated 2 years ago
- This repo has some proposed agenda for Azure Machine Learning related hands-on workshops. ☆11 · Feb 2, 2021 · Updated 5 years ago
- Portfolio of projects and studies conducted in data engineering. ☆34 · Feb 22, 2025 · Updated last year
- ☆13 · Dec 28, 2023 · Updated 2 years ago
- Source code for the Modern Golang Programming course ☆10 · Jul 6, 2017 · Updated 8 years ago
- ☆202 · Oct 10, 2023 · Updated 2 years ago
- A notebook showing how to easily convert a current notebook you have to a notebook that can be run on Kubeflow Pipelines. ☆15 · Jul 15, 2020 · Updated 5 years ago
- Building a real-time data pipeline using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django and Flexmonster on Docker to track status … ☆24 · Dec 29, 2020 · Updated 5 years ago
- Knox plugin which streams all the files in an S3 bucket or folder. ☆31 · Apr 9, 2023 · Updated 3 years ago
- A collection of all the methods used in feature engineering, organized for reuse ☆11 · Jan 11, 2021 · Updated 5 years ago
- Terraform Module to create an Apache Spark cluster on AWS ☆16 · Jan 3, 2022 · Updated 4 years ago
- This project shows how to capture changes from a Postgres database and stream them into Kafka ☆42 · May 17, 2024 · Updated last year
- This application "listens" for a ticket creation event from Zendesk, analyses the ticket for negative sentiment, tags the ticket accordin… ☆14 · Mar 10, 2025 · Updated last year
- Pipeline, warehouse, and visualization tools for investigating the impact of Airbnb short-term rentals on world cities. ☆14 · Jun 9, 2023 · Updated 2 years ago
- ☆10 · Jun 22, 2020 · Updated 5 years ago
- Generates a tree of an S3 bucket's contents ☆11 · Sep 18, 2020 · Updated 5 years ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke… ☆20 · Aug 12, 2025 · Updated 8 months ago
- Complete a hands-on data analysis project in 14 days ☆10 · Sep 7, 2022 · Updated 3 years ago
- Skooldio: Data Pipelines with Airflow ☆23 · May 24, 2025 · Updated 11 months ago
- Lu Wei's "Machine Learning: Formula Derivation and Code Implementation". Its overall classification of algorithms is the highlight. The algorithm principles and code implementations are also relatively simple, and it can be read alongside "Machine Learning in Action". ☆11 · Oct 19, 2022 · Updated 3 years ago
- Loan Default Prediction using PySpark, with jobs scheduled by Apache Airflow and Integration with Spark using Apache Livy ☆22 · Dec 26, 2020 · Updated 5 years ago
- Alpha Streams Public SDK. ☆13 · Mar 27, 2024 · Updated 2 years ago