Project for "Data pipeline design patterns" blog.
☆51 · Aug 6, 2024 · Updated last year
Alternatives and similar repositories for socialetl
Users that are interested in socialetl are comparing it to the libraries listed below.
- Example repo to create end to end tests for data pipeline. ☆25 · Jun 14, 2024 · Updated last year
- Code to demonstrate data engineering metadata & logging best practices ☆21 · Mar 12, 2024 · Updated 2 years ago
- Code for data quality with greatexpectations blog ☆13 · Jul 30, 2024 · Updated last year
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆13 · May 24, 2024 · Updated 2 years ago
- ☆16 · Apr 26, 2024 · Updated 2 years ago
- A custom end-to-end analytics platform for customer churn ☆10 · May 15, 2025 · Updated last year
- Repo for CDC with debezium blog post ☆29 · Sep 15, 2024 · Updated last year
- Repository for Data Engineering Interview Series ☆38 · Oct 17, 2024 · Updated last year
- Step by step instructions to create a production-ready data pipeline ☆61 · Dec 23, 2024 · Updated last year
- End to end data engineering project ☆59 · Oct 27, 2022 · Updated 3 years ago
- Code for blog at: https://www.startdataengineering.com/post/docker-for-de/ ☆40 · Apr 29, 2024 · Updated 2 years ago
- Near real time ETL to populate a dashboard. ☆75 · Sep 9, 2025 · Updated 8 months ago
- Sample project to demonstrate data engineering best practices ☆219 · Feb 24, 2024 · Updated 2 years ago
- Code for dbt tutorial ☆178 · Sep 9, 2025 · Updated 8 months ago
- Cost Efficient Data Pipelines with DuckDB ☆61 · May 14, 2025 · Updated last year
- Example files used in the DuckDB - Unity Catalog blog ☆10 · Dec 6, 2024 · Updated last year
- Spark, Airflow, Kafka ☆24 · Apr 30, 2023 · Updated 3 years ago
- Local development environment for python data projects, with Docker ☆23 · Dec 14, 2022 · Updated 3 years ago
- ☆21 · Mar 26, 2023 · Updated 3 years ago
- Code for "Advanced data transformations in SQL" free live workshop ☆92 · May 5, 2025 · Updated last year
- This project aims to build a streaming application to perform real-time analytics of Covid-19 related tweets and deploy an ML model for r… ☆14 · Jul 15, 2021 · Updated 4 years ago
- Practical FP in Scala book by Gabriel Volpe. Implementation with my view ☆17 · Aug 17, 2024 · Updated last year
- Twitch Stream Analysis with Apache Spark and Apache Zeppelin ☆12 · Aug 8, 2016 · Updated 9 years ago
- Primary repository for NYC DCP's Data Engineering team ☆40 · May 21, 2026 · Updated last week
- A Generic System to validate query params and filter data in Django Rest Framework ☆16 · Apr 28, 2026 · Updated last month
- Compare 2 basketball players by reading/comparing NBA stats in an Excel sheet. ☆11 · Aug 19, 2018 · Updated 7 years ago
- Slowly Changing Dimension Type 2 (scd2) custom materialization ☆11 · Apr 6, 2026 · Updated last month
- A dumb auditing service ☆23 · Updated this week
- Bits of code that I'm sharing with the world to hopefully make your life a little easier! ☆17 · May 22, 2025 · Updated last year
- This project shows how to capture changes from postgres database and stream them into kafka ☆42 · May 17, 2024 · Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆56 · May 6, 2023 · Updated 3 years ago
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆51 · Sep 7, 2023 · Updated 2 years ago
- Events about the open source data stack ☆13 · Apr 16, 2022 · Updated 4 years ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆105 · Jun 7, 2024 · Updated last year
- The refactoring tutorial I wrote for PyConDE 2022. You can also work through the exercises on your own. ☆19 · Apr 22, 2024 · Updated 2 years ago
- CLI tool which applies common patches to music tags. ☆22 · Jan 1, 2026 · Updated 4 months ago
- The code from the whylogs workshop in DataTalks.Club on 29 March 2022 ☆13 · Mar 29, 2022 · Updated 4 years ago
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆293 · Jul 11, 2024 · Updated last year
- Utility functions for dbt projects running on Spark ☆36 · Dec 17, 2025 · Updated 5 months ago