zhujun98/data-engineering

zhujun98 / data-engineering

Spark, Airflow, Kafka

☆24

Alternatives and similar repositories for data-engineering

Users that are interested in data-engineering are comparing it to the libraries listed below.

KumarRoshandot / AirFlow_Kafka_Spark_Docker
This is a recipe for docker container based architecture based on airflow, kafka, spark, docker
☆19 · Oct 15, 2024 · Updated last year
TJaniF / airflow-kafka-quickstart
A self-contained, ready to run Airflow and Kafka project. Can be run locally or within codespaces.
☆16 · Jul 15, 2023 · Updated 3 years ago
PritomDas / Real-Time-Streaming-Data-Pipeline-and-Dashboard
Building Real Time Data Pipeline using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django and Flexomonster on Docker to track status …
☆24 · Dec 29, 2020 · Updated 5 years ago
shabie / streaming_nd
Data Streaming Nanodegree (from Udacity) exercises, projects and their solutions
☆17 · Aug 14, 2023 · Updated 2 years ago
danieldiamond / data-engineering-capstone
Data Engineering Capstone Project: ETL Pipelines and Data Warehouse Development
☆22 · Jul 9, 2019 · Updated 7 years ago
DucAnhNTT / bigdata-ETL-pipeline
The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a compl…
☆18 · Dec 26, 2023 · Updated 2 years ago
ybangaru / wallstreetbets-sentiment-analysis
☆10 · May 24, 2021 · Updated 5 years ago
darenasc / data-science-for-good
Data Science for Good links.
☆14 · Nov 10, 2021 · Updated 4 years ago
greole / foamMon
A simple tool for monitoring the progress of OpenFOAM simulations
☆13 · Nov 9, 2018 · Updated 7 years ago
fbob / mplFOAM
Some functions to plot OpenFOAM data with Matplotlib
☆11 · Apr 15, 2021 · Updated 5 years ago
florimondmanca / kafka-fraud-detector
🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python
☆92 · Apr 29, 2019 · Updated 7 years ago
argx / fake-fews
Candidate solution for Facebook's fake news problem using machine learning and crowd-sourcing. Takes form of a Chrome extension. Develope…
☆13 · Aug 25, 2017 · Updated 8 years ago
angelddaz / de-challenges
Project based learning for Data Engineering fundamentals.
☆13 · Jan 15, 2021 · Updated 5 years ago
LaurentRisser / DS_project_ETL_with_AWS_Twitter
Setup an ETL from Twitter API to S3
☆10 · Nov 20, 2020 · Updated 5 years ago
akashsethi24 / Machine-Learning
Examples of all Machine Learning Algorithm in Apache Spark
☆15 · Nov 2, 2017 · Updated 8 years ago
Speccy-Rom / SpeccyTV
💻👨‍💻⌨️️ Streaming service with ETL on steroids
☆19 · Feb 23, 2025 · Updated last year
IremErturk / dtc-de-capstone-project
☆11 · Apr 9, 2022 · Updated 4 years ago
Noodle-ai / mlflow_part1_condaEnv
Introduction to MLflow and Using MLflow with an Anaconda Environment
☆11 · Sep 17, 2020 · Updated 5 years ago
mdolab / pyofm
Python wrapper for OpenFOAM meshes
☆13 · Sep 16, 2025 · Updated 10 months ago
hfhoffman1144 / smartphone_sensor_stream2
Stream smartphone sensor data with FastAPI, Kafka, ksqlDB, and Docker.
☆11 · Aug 3, 2023 · Updated 2 years ago
tkh5044 / portfolio
My professional portfolio with some of my best data science projects.
☆11 · Jun 22, 2017 · Updated 9 years ago
Apress / beginning-apache-spark-3
Source Code for 'Beginning Apache Spark 3' by Hien Luu
☆13 · Oct 14, 2021 · Updated 4 years ago
petebachant / foamPy
A Python package for working with OpenFOAM.
☆15 · Oct 2, 2024 · Updated last year
scheingraber / cfd_with_chemical_reactions
Group Project: CFD solver taking heat into account, with transport of chemical substances and chemical reactions.
☆12 · Oct 24, 2017 · Updated 8 years ago
narayave / Insight-GDELT-Feed
A way for home buyers to know about factors affecting a state
☆48 · Mar 2, 2019 · Updated 7 years ago
guesswh0 / face_engine
Facial recognition engine
☆10 · Jul 12, 2026 · Updated last week
aravindr18 / RedditR--Insight-Data-Engineering-Project
RedditR for Content Engagement and Recommendation
☆18 · Dec 21, 2017 · Updated 8 years ago
dpage / ml-experiments
Scripts and code written whilst learning and experimenting with machine learning
☆12 · Jul 18, 2022 · Updated 4 years ago
dmitrypol / redis_data
using Redis for data science and data engineering
☆16 · Jan 14, 2020 · Updated 6 years ago
AdriaPadilla / Twitter-API-V2-full-archive-Search-academics
Python Script to search and extract tweets from TWITTER using API v2.
☆19 · Sep 20, 2024 · Updated last year
abduldjafar / elt-with-dagster
☆15 · Aug 3, 2022 · Updated 3 years ago
redpanda-data-university / rp-use-cases-algo-trading
☆11 · Aug 20, 2024 · Updated last year
vijaykothareddy / Data-Engineering
Code for my blogs on Data Engineering
☆15 · Nov 9, 2020 · Updated 5 years ago
manuel-lang / Data-Engineering-Nanodegree
Solution to all projects of Udacity's Data Engineering Nanodegree: Data Modeling with Postgres & Cassandra, Data Warehouse with Redshift,…
☆58 · Oct 20, 2022 · Updated 3 years ago
AprendizajeProfundo / Modelamiento-Metodos-Numericos
☆17 · Feb 5, 2021 · Updated 5 years ago
fabiogjardim / datalab
☆10 · Jan 27, 2025 · Updated last year
aws-samples / aws-sagemaker-ml-blog-predictive-campaigns
Deliver Pinpoint Campaigns Driven by Machine Learning on AWS SageMaker
☆18 · Feb 10, 2019 · Updated 7 years ago
aws-samples / aws-autonomous-driving-data-lake-image-extraction-pipeline-from-ros-bagfiles
This workshop will familiarize you with some of the key steps towards building an autonomous driving data lake and extracting images from…
☆11 · Jul 12, 2022 · Updated 4 years ago
PacktPublishing / Distributed-Data-Systems-with-Azure-Databricks
Distributed Data Systems with Azure Databricks, published by Packt
☆12 · Jan 18, 2023 · Updated 3 years ago