A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousing, containerization, and a dashboard to monitor data pipeline KPIs
☆15 · Apr 29, 2021 · Updated 5 years ago
Alternatives and similar repositories for Data_Engineering_Projects
Users who are interested in Data_Engineering_Projects are comparing it to the libraries listed below.
- ☆15 · May 14, 2024 · Updated 2 years ago
- Constructed a dashboard with FastAPI that extracts data from the yfinance API to a SQLAlchemy database. ☆21 · Mar 16, 2025 · Updated last year
- Capstone Project for the IBM Data Engineering Professional Certification. ☆13 · Mar 7, 2022 · Updated 4 years ago
- Hands-On Low-Code Application Development with Salesforce, published by Packt ☆11 · Jan 18, 2023 · Updated 3 years ago
- Repository containing example solutions for the Data Engineering Career Path Portfolio Projects ☆18 · Sep 16, 2022 · Updated 3 years ago
- The Modern Data Stack in a (Smaller) Box ☆12 · Jan 28, 2023 · Updated 3 years ago
- CI/CD repository template to automate deployments of your production flows ☆15 · Jul 1, 2024 · Updated last year
- This data project can be used as a take-home assignment to learn PySpark and Data Engineering. ☆19 · Feb 19, 2023 · Updated 3 years ago
- 💳 ETL (Extract, Transform and Load) pipeline for calculating stats for a transactions database & testing the efficacy of a loyalty prog… ☆10 · Apr 25, 2017 · Updated 9 years ago
- Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark ☆11 · May 22, 2018 · Updated 8 years ago
- 🎩 AI-powered cover letter generator ☆27 · Jul 13, 2025 · Updated 11 months ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆29 · Aug 14, 2023 · Updated 2 years ago
- Scan and monitor your network effortlessly! Nmap Prometheus Exporter provides insights into network health and security with Prometheus-c… ☆15 · Oct 2, 2023 · Updated 2 years ago
- Udacity Data Engineering Nanodegree Project 3 ☆12 · Jul 14, 2019 · Updated 6 years ago
- Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed airflow: extrac… ☆10 · Jul 12, 2021 · Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆56 · May 6, 2023 · Updated 3 years ago
- Course Material Data Engineering on AWS Course ☆31 · Sep 9, 2024 · Updated last year
- Data Engineering Hours With Experts Coding Challenge ☆13 · Mar 16, 2026 · Updated 3 months ago
- End-to-end ELT data engineering project ☆23 · Dec 24, 2022 · Updated 3 years ago
- Code for the Data Engineering Zoomcamp ☆20 · Dec 12, 2022 · Updated 3 years ago
- AWS Certified Solutions Architect Associate exam preparation notes ☆39 · Nov 11, 2020 · Updated 5 years ago
- A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation an… ☆24 · Nov 21, 2023 · Updated 2 years ago
- ☆36 · Feb 6, 2024 · Updated 2 years ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆48 · Dec 11, 2023 · Updated 2 years ago
- Best practices for data workflows, integrations with the Modern Data Stack (MDS), Infrastructure as Code (IaC), Cloud Provider Services ☆38 · Jan 21, 2026 · Updated 5 months ago
- Simple ETL pipeline using Python ☆29 · May 22, 2023 · Updated 3 years ago
- 💯 Materials to help you rock your next coding interview ☆12 · Aug 18, 2019 · Updated 6 years ago
- Loan Default Prediction using PySpark, with jobs scheduled by Apache Airflow and Integration with Spark using Apache Livy ☆22 · Dec 26, 2020 · Updated 5 years ago
- SQLMesh example projects ☆42 · Jul 2, 2025 · Updated 11 months ago
- DataTalks.Club's Data Engineering Zoomcamp Project ☆24 · Jul 14, 2022 · Updated 3 years ago
- A project portfolio to accompany my resume ☆30 · Sep 5, 2023 · Updated 2 years ago
- Hexagonal (ports and adapters) architecture applied to Spark and Python data engineering project ☆33 · Jul 26, 2023 · Updated 2 years ago
- Firefox extension that shows the Parquet schema when going over GCP Cloud Storage. Uses DuckDB WASM ☆12 · Jan 19, 2024 · Updated 2 years ago
- Data science interview questions, organized by company, covering data analyst, junior data scientist, machine learning engineer, etc. pos… ☆17 · Apr 20, 2022 · Updated 4 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆89 · Nov 22, 2021 · Updated 4 years ago
- A collection of Airflow configurations and DAGs for Kubernetes- and Spark-based data pipelines. Scale inside Kubernetes using spark kubernetes maste… ☆23 · Feb 22, 2022 · Updated 4 years ago
- End-to-End BI & DW project: Data Warehousing design and modeling (MySQL), ETL (PDI) and Dashboard (Tableau) ☆19 · Aug 10, 2020 · Updated 5 years ago
- This is a capstone project that entails building an end-to-end ETL (Extract-Transform-Load) Data pipeline which extracts UK accident and … ☆18 · Jun 6, 2020 · Updated 6 years ago
- This repository contains an independent project on Social Media Analytics to identify key predictors of social influence. ☆22 · Aug 13, 2020 · Updated 5 years ago