monksy/awesome-data-engineering

monksy / awesome-data-engineering

A curated list of data engineering tools for software developers

☆13

Alternatives and similar repositories for awesome-data-engineering

Users who are interested in awesome-data-engineering are comparing it to the repositories listed below.

JuliaGraphs / LightGraphsMatching.jl
Matching algorithms for LightGraphs.jl
☆13 · Oct 21, 2021 · Updated 4 years ago
ahmadmursaleen / Vuln-Analysis-ML-Python
This project deals with vulnerability analysis and classification using machine learning techniques, i.e., Natural Language Processing.
☆10 · Feb 21, 2019 · Updated 7 years ago
thuva4 / Bigdata-Papers-Reading
A curated reading list of big data papers for anyone who is eager to learn!
☆30 · Dec 22, 2024 · Updated last year
softdevteam / depub
Reduce the visibility of elements in a Rust code base
☆17 · Jun 19, 2026 · Updated last month
NajiElKotob / Awesome-BigData
Big Data Resources and References
☆13 · Sep 4, 2024 · Updated last year
monksy / awesome-kafka
A collection of Kafka resources
☆216 · May 13, 2026 · Updated 2 months ago
robbyrussell / programing-best-practices-2023
A curated list of awesome Programming Best Practices 2023
☆11 · Jan 2, 2023 · Updated 3 years ago
mithunmanohar / awsome-snowflake
A curated list of awesome Snowflake analytic data warehouse learning resources
☆23 · Mar 1, 2021 · Updated 5 years ago
AnnaVM / Project_Plotline
Capstone project for Galvanize - Data Science Immersive. 'Project Plotline' looks at the emotional content of movie scripts (web scraping…
☆16 · Sep 27, 2016 · Updated 9 years ago
reisdebora / awesome-databricks
A curated list of awesome Databricks resources, including Spark
☆22 · Jun 28, 2024 · Updated 2 years ago
harishd10 / recon
Recon - A fast algorithm to compute Reeb graphs
☆16 · Aug 27, 2014 · Updated 11 years ago
tialaramex / misfortunate
Perverse implementations of safe Rust traits
☆23 · Dec 21, 2025 · Updated 7 months ago
ogbinar / py-dataengineering-workshop
From CSV to Dashboard — Building a Mini Data Pipeline in Pure Python
☆27 · Oct 26, 2025 · Updated 9 months ago
nursnaaz / Data-Science-Training-Python-
Material for a 2-day machine learning workshop conducted in Chennai on Jan 6th and 7th
☆15 · Feb 6, 2018 · Updated 8 years ago
PacktPublishing / Big-Data-Architects-Handbook
Big Data Architect’s Handbook, published by Packt
☆21 · Jan 30, 2023 · Updated 3 years ago
RobertoDebarba / microservices-python-example
A simple microservice project using Python, RabbitMQ, Nameko and Flask
☆19 · May 28, 2016 · Updated 10 years ago
Consensys / orchestrate
Orchestrate is a blockchain Transaction Orchestration system that can operate multiple chains simultaneously
☆22 · Jun 24, 2024 · Updated 2 years ago
aws-samples / aws-lambda-etl-ref-architecture
This reference architecture demonstrates the use of AWS Step Functions to orchestrate an Extract, Transform, Load (ETL) workflow with AWS La…
☆24 · Jun 16, 2020 · Updated 6 years ago
chandu-muthyala / Data-Engineer-Nano-Degree
Data modeling, building data warehouses and data lakes, automating data pipelines, and working with massive datasets.
☆12 · Jul 16, 2019 · Updated 7 years ago
kafbat / kafka-leader-election
This Java library is designed to facilitate leader election within Kafka clusters, providing an efficient and robust solution for di…
☆30 · Jun 9, 2023 · Updated 3 years ago
JuliaGraphs / LightGraphsFlows.jl
Flow algorithms on LightGraphs
☆35 · Jan 22, 2022 · Updated 4 years ago
mkdasher / SM64Lua
☆19 · May 31, 2023 · Updated 3 years ago
curiousest / predict-AQI
Predicting air pollution (machine learning project)
☆21 · Feb 19, 2017 · Updated 9 years ago
pigiuz / engineering-management
A collection of books/articles/resources that helped me grow as a manager
☆55 · Mar 20, 2022 · Updated 4 years ago
Jayvardhan-Reddy / BigData-Ecosystem-Architecture
Life-cycle: Internal working of HDFS, SQOOP, HIVE, SPARK, HBASE, KAFKA with code.
☆15 · Sep 10, 2019 · Updated 6 years ago
RiskThinking / work-samples
Public-facing work samples for technical hiring assessment
☆20 · Jan 19, 2024 · Updated 2 years ago
naikshubham / Predictive-Analytics-in-Python
Build an ML model with meaningful variables. Use the model for predictions
☆17 · Nov 22, 2022 · Updated 3 years ago
yangcvo / Zabbix-Monitoring-Kafka
Zabbix monitoring of Kafka cluster broker services, Kafka Consumer Monitoring
☆11 · Jun 7, 2017 · Updated 9 years ago
itversity / mastering-redshift
☆16 · Jul 31, 2022 · Updated 3 years ago
katreparitosh / Discourse-Analytics-of-Political-Speech-Transcripts
Political Discourse Analysis (PDA) of Political Speech Transcripts using Natural Language Processing (NLP)
☆17 · Apr 28, 2021 · Updated 5 years ago
PacktPublishing / Real-world-Machine-Learning-Projects-using-TensorFlow
Real world Machine Learning Projects using TensorFlow by Packt Publishing
☆14 · Jan 15, 2021 · Updated 5 years ago
x64dbg / blog
Blog for x64dbg.
☆13 · Jul 18, 2026 · Updated last week
paritytech / sub-flood
Flooding a Substrate node with transactions
☆27 · Mar 26, 2022 · Updated 4 years ago
shawntabrizi / substrate-feeless-token-factory
Alternative fee mechanics for managing token assets on Substrate.
☆27 · Nov 21, 2019 · Updated 6 years ago
smsilva / terraform-packager
Terraform Packager Scripts
☆12 · May 5, 2026 · Updated 2 months ago
Warchant / cmake-hunter-seed
Seed project for C++ projects
☆27 · Jul 14, 2021 · Updated 5 years ago
techsuppdiva / spark-cheat-sheets
This repo stores my Spark Tutorial slides.
☆15 · Feb 8, 2016 · Updated 10 years ago
kbrebanov / ansible-squid
Ansible Squid role
☆13 · Sep 24, 2018 · Updated 7 years ago
Bluteshi / awesome-climate-research
A curated list of resources about various climate research topics
☆24 · Jan 8, 2023 · Updated 3 years ago