mozilla / docker-etl
Collection of dockerized ETL jobs managed by data engineering.
☆21 · Updated this week
Alternatives and similar repositories for docker-etl
Users who are interested in docker-etl are comparing it to the repositories listed below.
- Weekly Data Engineering Newsletter ☆96 · Updated last year
- Utility functions for dbt projects running on Spark ☆33 · Updated 8 months ago
- Airflow configuration for Telemetry ☆195 · Updated last week
- PySpark schema generator ☆43 · Updated 2 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆113 · Updated 2 months ago
- Great Expectations Airflow operator ☆167 · Updated last week
- Astronomer Core Docker Images ☆106 · Updated last year
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆169 · Updated 2 years ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆78 · Updated 2 years ago
- ☆49 · Updated 8 months ago
- JumpSpark - A modern cookiecutter template for PySpark projects with batteries included. ☆10 · Updated 2 years ago
- Fast iterative local development and testing of Apache Airflow workflows ☆201 · Updated 2 months ago
- Read Delta tables without any Spark ☆47 · Updated last year
- Soda Spark is a PySpark library that helps you with testing your data in Spark DataFrames ☆64 · Updated 3 years ago
- Data-aware orchestration with Dagster, dbt, and Airbyte ☆30 · Updated 2 years ago
- Unity Catalog UI ☆43 · Updated last year
- Delta Lake examples ☆230 · Updated last year
- Data validation library for PySpark 3.0.0 ☆33 · Updated 2 years ago
- Demo of DuckDB with the dashboarding tools Evidence, Streamlit, and Rill ☆21 · Updated last year
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 · Updated last year
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 · Updated last month
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆60 · Updated last week
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆62 · Updated 2 years ago
- Airflow Providers containing Deferrable Operators & Sensors from Astronomer ☆149 · Updated last week
- BigQuery ETL ☆324 · Updated this week
- New generation open-source data stack ☆74 · Updated 3 years ago
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆74 · Updated last year
- The go-to demo for public and private dbt Learn ☆80 · Updated 6 months ago
- A dbt package to perform DataOps & administrative CI/CD on your data warehouse. ☆16 · Updated 4 years ago
- A DataOps framework for building a lakehouse. ☆53 · Updated last week