ssp-data / data-engineering-devops
Full stack data engineering tools and infrastructure set-up
☆51 Updated 4 years ago
Alternatives and similar repositories for data-engineering-devops:
Users who are interested in data-engineering-devops are comparing it to the repositories listed below
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 8 months ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆27 Updated 2 years ago
- Cost Efficient Data Pipelines with DuckDB ☆51 Updated 8 months ago
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆50 Updated last year
- Data-aware orchestration with dagster, dbt, and airbyte ☆31 Updated 2 years ago
- Open Data Stack Projects: Examples of End to End Data Engineering Projects ☆80 Updated last year
- ☆17 Updated 8 months ago
- Delta-Lake, ETL, Spark, Airflow ☆47 Updated 2 years ago
- Code snippets for Data Engineering Design Patterns book ☆80 Updated last month
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆49 Updated 5 months ago
- Utility functions for dbt projects running on Spark ☆32 Updated 2 months ago
- Template for Data Engineering and Data Pipeline projects ☆109 Updated 2 years ago
- ☆77 Updated 6 months ago
- Delta Lake Documentation ☆49 Updated 10 months ago
- New generation open-source data stack ☆67 Updated 2 years ago
- ☆21 Updated 3 years ago
- ☆18 Updated last year
- Example repo to create end to end tests for data pipeline. ☆23 Updated 10 months ago
- CSV and flat-file sniffer built in Rust. ☆42 Updated last year
- ☆10 Updated 2 years ago
- ☆36 Updated last month
- ☆16 Updated 11 months ago
- Data pipeline that scrapes Rust cheater Steam profiles ☆52 Updated 3 years ago
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- Containerized end-to-end analytics of Spotify data using Python, dbt, Postgres, and Metabase ☆126 Updated 2 years ago
- Cloned by the `dbt init` task ☆61 Updated 11 months ago
- Repo for CDC with debezium blog post ☆28 Updated 7 months ago
- A DataOps framework for building a lakehouse. ☆50 Updated this week
- A repository of sample code to show data quality checking best practices using Airflow. ☆76 Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆54 Updated last year