mozilla / docker-etl
Collection of dockerized ETL jobs managed by data engineering.
☆20 Updated last week
Alternatives and similar repositories for docker-etl
Users who are interested in docker-etl are comparing it to the libraries listed below.
- Weekly Data Engineering Newsletter ☆96 Updated last year
- PySpark schema generator ☆43 Updated 2 years ago
- Utility functions for dbt projects running on Spark ☆33 Updated 5 months ago
- Read Delta tables without any Spark ☆47 Updated last year
- Delta Lake helper methods. No Spark dependency. ☆23 Updated 10 months ago
- New generation opensource data stack ☆70 Updated 3 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆110 Updated last week
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆64 Updated 3 years ago
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 Updated last year
- Full stack data engineering tools and infrastructure set-up ☆54 Updated 4 years ago
- A DBT package to perform DataOps & administrative CI/CD on your data warehouse. ☆16 Updated 4 years ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆158 Updated 2 years ago
- A Table format agnostic data sharing framework ☆38 Updated last year
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Data-aware orchestration with dagster, dbt, and airbyte ☆30 Updated 2 years ago
- Rules based grant management for Snowflake ☆40 Updated 6 years ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 Updated 2 years ago
- Delta Lake examples ☆227 Updated 9 months ago
- Making DAG construction easier ☆268 Updated 2 weeks ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆218 Updated last week
- Build your feature store with macros right within your dbt repository ☆39 Updated 2 years ago
- Fast iterative local development and testing of Apache Airflow workflows ☆202 Updated 3 months ago
- A simple Spark-powered ETL framework that just works 🍺 ☆182 Updated this week
- a pytest plugin for dbt adapter test suites ☆19 Updated last year
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆75 Updated last week
- Parse dbt artifacts and search dbt models with Algolia ☆52 Updated 4 years ago
- Data Tools Subjective List ☆86 Updated last year
- Personal Finance Project to automatically collect swiss banking transaction into a DWH and visualise it ☆26 Updated last year
- Spark app to merge different schemas ☆23 Updated 4 years ago
- Fake Pandas / PySpark DataFrame creator ☆47 Updated last year