Some example projects for Data Engineers to build, end-to-end.
☆39Nov 8, 2023Updated 2 years ago
Alternatives and similar repositories for DataEngineeringProjects
Users that are interested in DataEngineeringProjects are comparing it to the libraries listed below.
- This is the end-to-end MLOps project I built while participating in the MLOps Zoomcamp☆10Sep 11, 2022Updated 3 years ago
- A repo to track data engineering projects☆14Nov 11, 2022Updated 3 years ago
- A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.☆256Dec 19, 2025Updated 6 months ago
- ☆13Dec 15, 2023Updated 2 years ago
- End-to-end data platform leveraging the Modern data stack☆52Apr 10, 2024Updated 2 years ago
- This is a collection of Amazon CDK projects to show how to directly ingest streaming data from Amazon Managed Service for Apache Kafka (M…☆16Sep 10, 2024Updated last year
- End to end data engineering project☆59Oct 27, 2022Updated 3 years ago
- how to unit test your PySpark code☆29Mar 26, 2021Updated 5 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary.☆29Aug 14, 2023Updated 2 years ago
- An end to end ML project. Using MLflow for experiment tracking and model registry. Prefect for workflow orchestration. S3 for artifacts s…☆12Sep 11, 2022Updated 3 years ago
- Data Pipeline that utilizes GCP, Python 3.10, Prefect, and more.☆10Jan 23, 2023Updated 3 years ago
- This repo is for LinkedIn Learning course: Advanced RAG Applications with Vector Databases☆32Oct 17, 2024Updated last year
- Hexagonal (ports and adapters) architecture applied to Spark and Python data engineering project☆33Jul 26, 2023Updated 2 years ago
- Data Engineering Project to Extract and Process Solana Reddit Data☆40Feb 3, 2024Updated 2 years ago
- ☆23Jun 30, 2024Updated last year
- Simulate SDTM datasets in SAS.☆12Jan 19, 2018Updated 8 years ago
- A complete pipeline to pull data from Scryfall's "Magic: The Gathering"-API, via Prefect orchestration and dbt transformation.☆43Apr 27, 2023Updated 3 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as …☆17Oct 1, 2019Updated 6 years ago
- Template to use genai for chatting and via api to accelerate research☆34Feb 17, 2026Updated 4 months ago
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.☆25Apr 27, 2023Updated 3 years ago
- ☆21Mar 11, 2025Updated last year
- This repository contains the code snippets used in "LLM Prompt Engineering For Developers"☆14Apr 22, 2024Updated 2 years ago
- ☆22Jul 24, 2023Updated 2 years ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d…☆24Nov 22, 2021Updated 4 years ago
- ☆16Apr 26, 2024Updated 2 years ago
- Coding with ChatGPT and other LLMs, published by Packt☆16Dec 9, 2024Updated last year
- Using DuckDB with AWS Lambda to process Delta Lake data☆34Jan 26, 2025Updated last year
- ☆276Jun 7, 2026Updated 3 weeks ago
- A simple script for backing up your favorite YouTube channels.☆12Jan 27, 2024Updated 2 years ago
- A curated list of awesome SQLMesh resources☆38Apr 30, 2025Updated last year
- This Guidance helps customers design a resilient batch process application using AWS services☆19Mar 1, 2026Updated 3 months ago
- Sample code for building a Python application for Apache Flink on Kinesis Data Analytics.☆14Aug 30, 2023Updated 2 years ago
- Clinical trial data analytic recipes in R for SAS users☆30Sep 3, 2024Updated last year
- plugin to check spacing between sentences☆10Sep 10, 2023Updated 2 years ago
- Comprehensive Python client for the Uniprot REST API☆57Oct 6, 2025Updated 8 months ago
- ☆11Mar 24, 2021Updated 5 years ago
- An experimental attempt to make a CLI for supply-chain modeling for Helpful Engineering's Project Data☆10Oct 29, 2023Updated 2 years ago
- MCP server for Grok AI API integration☆24Jun 2, 2025Updated last year
- This web analytics demo shows how to collect web logs with API Gateway and store them into S3 through Amazon Kinesis. Then this project s…☆21Apr 4, 2025Updated last year