Example repo for creating end-to-end tests for a data pipeline.
☆25Jun 14, 2024Updated last year
Alternatives and similar repositories for e2e_datapipeline_test
Users that are interested in e2e_datapipeline_test are comparing it to the libraries listed below.
- ☆16Apr 26, 2024Updated 2 years ago
- End to end data engineering project☆59Oct 27, 2022Updated 3 years ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/☆13May 24, 2024Updated 2 years ago
- Repository for Data Engineering Interview Series☆38Oct 17, 2024Updated last year
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow☆34Oct 18, 2020Updated 5 years ago
- Code to demonstrate data engineering metadata & logging best practices☆21Mar 12, 2024Updated 2 years ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash.☆15Jun 26, 2023Updated 2 years ago
- A custom end-to-end analytics platform for customer churn☆10May 15, 2025Updated last year
- Near real time ETL to populate a dashboard.☆75Sep 9, 2025Updated 8 months ago
- Simple ETL pipeline using Python☆29May 22, 2023Updated 3 years ago
- Step by step instructions to create a production-ready data pipeline☆60Dec 23, 2024Updated last year
- Docker image for Spark history server on Kubernetes☆15Mar 13, 2020Updated 6 years ago
- Repository for Spark using Python material. It is popularly known as PySpark.☆20Aug 18, 2021Updated 4 years ago
- Code for "Advanced data transformations in SQL" free live workshop☆92May 5, 2025Updated last year
- Generate Python data structures and XML parser from Xschema (Python 3 port)☆12Jan 13, 2015Updated 11 years ago
- Repo for CDC with debezium blog post☆29Sep 15, 2024Updated last year
- The Demo for Blog: Modularization using Python and Docker (MicroService)☆12Feb 4, 2021Updated 5 years ago
- Tarot widget for website☆12Jan 6, 2023Updated 3 years ago
- ☆10Nov 30, 2024Updated last year
- Different ways to connect to storage in Azure Databricks☆11Jul 19, 2019Updated 6 years ago
- Primary repository for NYC DCP's Data Engineering team☆40Updated this week
- Full stack data engineering tools and infrastructure set-up☆58Feb 13, 2021Updated 5 years ago
- Simple demo for Databricks!☆14Sep 11, 2023Updated 2 years ago
- ☆26Mar 31, 2025Updated last year
- ☆32Aug 13, 2018Updated 7 years ago
- ☆11Jan 9, 2022Updated 4 years ago
- Code for blog at: https://www.startdataengineering.com/post/docker-for-de/☆40Apr 29, 2024Updated 2 years ago
- Collection of AWS Lambda functions in Python☆11Mar 13, 2019Updated 7 years ago
- Automated and tool agnostic data integration testing tool.☆10Mar 29, 2022Updated 4 years ago
- This repo contains implementation of various functionalities of various message queues in Python.☆13Aug 13, 2020Updated 5 years ago
- Data Agents are intelligent assistants built by data engineers to help non-data professionals navigate the organization’s data infrastruc…☆23Apr 14, 2025Updated last year
- Cost Efficient Data Pipelines with DuckDB☆61May 14, 2025Updated last year
- Birgitta is a Python ETL test and schema framework, providing automated tests for pyspark notebooks/recipes.☆14Nov 9, 2023Updated 2 years ago
- An Airflow pipeline for the collection of historical Twitter data☆10Aug 5, 2019Updated 6 years ago
- ☆12Apr 30, 2024Updated 2 years ago
- Due to lack of resources on how to deploy kafka with simple SASL authentication (just username and password) and how to write producer an…☆12Dec 29, 2021Updated 4 years ago
- Tool to deploy python virtualenvs☆13Mar 24, 2025Updated last year
- A cookiecutter template for creating Django IDAs quickly☆11May 21, 2020Updated 6 years ago
- A simple web application for arranging 'chats over coffee'.☆13Jul 19, 2023Updated 2 years ago