Repository for Data Engineering Interview Series
☆39Oct 17, 2024Updated last year
Alternatives and similar repositories for data-engineering-interview-series
Users that are interested in data-engineering-interview-series are comparing it to the libraries listed below.
- Code for data quality with greatexpectations blog☆13Jul 30, 2024Updated last year
- ☆16Apr 26, 2024Updated 2 years ago
- Example repo to create end to end tests for data pipeline.☆25Jun 14, 2024Updated 2 years ago
- Repo for CDC with debezium blog post☆29Sep 15, 2024Updated last year
- Step by step instructions to create a production-ready data pipeline☆61Dec 23, 2024Updated last year
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/☆13May 24, 2024Updated 2 years ago
- Code for blog at: https://www.startdataengineering.com/post/docker-for-de/☆40Apr 29, 2024Updated 2 years ago
- Code to demonstrate data engineering metadata & logging best practices☆21Mar 12, 2024Updated 2 years ago
- ☆14Dec 11, 2023Updated 2 years ago
- A custom end-to-end analytics platform for customer churn☆10May 15, 2025Updated last year
- Rust And Delta Demo. Explanation and walkthrough on delta-rs☆10Aug 21, 2023Updated 2 years ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash.☆15Jun 26, 2023Updated 2 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆45Jan 4, 2024Updated 2 years ago
- An end-to-end ELT pipeline to store simulated heart rate data inside a data warehouse; uses Kafka for real-time processing, Airbyte for d…☆15May 28, 2024Updated 2 years ago
- 65 Articles on SQL: A Comprehensive Guide to Mastering Advanced SQL☆11Jun 7, 2023Updated 3 years ago
- End-to-End ELT data pipeline with Postgres, Airbyte, dbt, Dagster, Snowflake and Metabase☆11Jul 13, 2023Updated 2 years ago
- End to end data engineering project☆59Oct 27, 2022Updated 3 years ago
- ☆16Aug 29, 2023Updated 2 years ago
- ☆14Apr 9, 2024Updated 2 years ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/☆107May 26, 2026Updated 2 weeks ago
- Simple stream processing pipeline☆112Jun 17, 2024Updated last year
- Cost Efficient Data Pipelines with DuckDB☆61May 14, 2025Updated last year
- Project for "Data pipeline design patterns" blog.☆52Aug 6, 2024Updated last year
- This repository hosts materials for the Docker for Data Engineers workshop, offering hands-on exercises and resources tailored for data e…☆17May 23, 2024Updated 2 years ago
- Near real time ETL to populate a dashboard.☆75Sep 9, 2025Updated 9 months ago
- Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)☆69Mar 9, 2024Updated 2 years ago
- A project from the ml_ops Zoomcamp (DataTalks) using Semiconductor data☆22Sep 7, 2022Updated 3 years ago
- LoL Esports Voice Analytics Capstone Project☆13Aug 18, 2025Updated 9 months ago
- Sample repo for startdataengineering DE 101 free course☆74Jun 24, 2024Updated last year
- A library for generating pseudo-random (but "realistic") data in python. A port of the faker gem to python (making use of its rich locale…☆19Oct 16, 2014Updated 11 years ago
- The typed graph between your code and whichever warehouse, table format, or query engine you've chosen — typed compiler, branches, replay…☆265Updated this week
- A pipeline orchestration tool☆35Aug 2, 2024Updated last year
- Final Project for Data Engineering Zoomcamp Course 2024 🧙🔥☆11Apr 17, 2024Updated 2 years ago
- ☆10Aug 20, 2024Updated last year
- ☆10May 24, 2021Updated 5 years ago
- People ask me about data science resources so I've curated some here: this is <<20% of the size of an 'awesome' list but has 80% of the v…☆11Jan 14, 2023Updated 3 years ago
- In this project, we have to create a predictive model which allows the company to maximize the profit of the next marketing campaign☆16Oct 18, 2025Updated 7 months ago
- Minimalistic, standalone alternative fake data generator with no dependencies☆21Jun 4, 2026Updated last week
- files created in ardan labs golang training☆12Nov 8, 2023Updated 2 years ago