josephmachado/e2e_datapipeline_test

Example repo for creating end-to-end tests for a data pipeline.

☆25

Alternatives and similar repositories for e2e_datapipeline_test

Users that are interested in e2e_datapipeline_test are comparing it to the libraries listed below.

josephmachado / data-quality-w-greatexpectations
Code for data quality with greatexpectations blog
☆13 · Jul 30, 2024 · Updated last year
josephmachado / simple_polars_etl
☆16 · Apr 26, 2024 · Updated 2 years ago
josephmachado / online_store
End to end data engineering project
☆59 · Oct 27, 2022 · Updated 3 years ago
josephmachado / data_helper
Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/
☆13 · May 24, 2024 · Updated 2 years ago
josephmachado / iceberg-features
☆14 · Dec 11, 2023 · Updated 2 years ago
josephmachado / data_engineering_best_practices_log
Code to demonstrate data engineering metadata & logging best practices
☆22 · Mar 12, 2024 · Updated 2 years ago
josephmachado / socialetl
Project for "Data pipeline design patterns" blog.
☆53 · Aug 6, 2024 · Updated last year
raashidsalih / churn-pipeline
A custom end-to-end analytics platform for customer churn
☆10 · May 15, 2025 · Updated last year
josephmachado / beginner_de_project_stream
Simple stream processing pipeline
☆112 · Jun 17, 2024 · Updated 2 years ago
josephmachado / bitcoinMonitor
Near real time ETL to populate a dashboard.
☆75 · Sep 9, 2025 · Updated 10 months ago
josephmachado / data-engineering-interview-series
Repository for Data Engineering Interview Series
☆41 · Oct 17, 2024 · Updated last year
damklis / etljob
Simple ETL pipeline using Python
☆29 · May 22, 2023 · Updated 3 years ago
lightbend / spark-history-server-docker
Docker image for Spark history server on Kubernetes
☆15 · Mar 13, 2020 · Updated 6 years ago
itversity / pyspark
Repository for Spark using Python material. It is popularly known as PySpark.
☆21 · Aug 18, 2021 · Updated 4 years ago
josephmachado / adv_data_transformation_in_sql
Code for "Advanced data transformations in SQL" free live workshop
☆93 · May 5, 2025 · Updated last year
keyiflerolsun / A101AktuelRobot
A101 Aktüel Telegram bot running on GitHub Workflows
☆14 · Sep 29, 2023 · Updated 2 years ago
GoogleEngineerExplains / LeetCode-Notes
☆10 · May 19, 2022 · Updated 4 years ago
daihuynh / dagster_dbt_metabase_simple_solution
A simple Data Engineering solution for testing or education purposes. You only need to know SQL and Python to understand this project. Da…
☆29 · Jul 2, 2022 · Updated 4 years ago
ricksladkey / generateDS
Generate Python data structures and XML parser from Xschema (Python 3 port)
☆12 · Jan 13, 2015 · Updated 11 years ago
josephmachado / change_data_capture
Repo for CDC with debezium blog post
☆30 · Sep 15, 2024 · Updated last year
databricks-demos / dbt-databricks-c360
Demo running DBT as a Databricks Workflow task
☆13 · Nov 13, 2024 · Updated last year
devlace / azure-databricks-storage
Different ways to connect to storage in Azure Databricks
☆11 · Jul 19, 2019 · Updated 7 years ago
josephmachado / beginner_de_project
Beginner data engineering project - batch edition
☆584 · Apr 13, 2026 · Updated 3 months ago
NYCPlanning / data-engineering
Primary repository for NYC DCP's Data Engineering team
☆42 · Updated this week
dataprofessor / streamlit-adjust-css
☆11 · Jan 9, 2022 · Updated 4 years ago
Foroozani / BigData_PySpark
Handle Big Data for Machine Learning using Python and PySpark, Building ETL Pipelines with PySpark, MongoDB, and Bokeh
☆10 · Nov 12, 2021 · Updated 4 years ago
chandra1sekar / data-engineering
☆32 · Aug 13, 2018 · Updated 7 years ago
jkwd / spotify
☆10 · Oct 20, 2022 · Updated 3 years ago
ssp-data / data-engineering-devops
Full stack data engineering tools and infrastructure set-up
☆58 · Feb 13, 2021 · Updated 5 years ago
aws-samples / aws-lambda-redshift-event-driven-app
Building Event Driven Application with AWS Lambda and Amazon Redshift Data API
☆17 · Oct 27, 2020 · Updated 5 years ago
VianneyMI / baker
Baker is an AI powered app that helps you find recipes and avoid food waste
☆14 · Jan 4, 2025 · Updated last year
alvintoh / udemy-hands-on-hadoop
AlvinToh Learning Repository for The Ultimate Hands-On Hadoop - Tame your Big Data!
☆10 · May 23, 2018 · Updated 8 years ago
parmsam / quarto-quizdown
Quizdown extension for HTML in Quarto
☆15 · Apr 12, 2025 · Updated last year
OpenDataAlex / etlTest
Automated and tool agnostic data integration testing tool.
☆11 · Mar 29, 2022 · Updated 4 years ago
ruthussanketh / natural-language-processing
Codes, datasets, and explanations for some basic natural language tasks and models.
☆11 · Dec 9, 2020 · Updated 5 years ago
QuentinAmbard / databricks-demo
Simple demo for Databricks!
☆14 · Sep 11, 2023 · Updated 2 years ago
Anam-Mahmood / Introduction-to-Big-Data-analysis-Machine-Learning-in-Python-with-PySpark
☆10 · May 26, 2021 · Updated 5 years ago
jalajthanaki / POS-tag-workshop
Understanding of POS tags and build a POS tagger from scratch
☆11 · Jun 9, 2018 · Updated 8 years ago
bananaml / serverless-template-gptj
☆19 · May 31, 2023 · Updated 3 years ago