mistercrunch / awesome-data-engineering
A curated list of data engineering tools for software developers
☆502 Updated 8 years ago
Alternatives and similar repositories for awesome-data-engineering
Users who are interested in awesome-data-engineering are comparing it to the libraries listed below.
- ☆201 Updated 2 years ago
- ETL best practices with airflow, with examples ☆1,353 Updated last year
- Example DAGs using hooks and operators from Airflow Plugins ☆348 Updated 7 years ago
- Apache Airflow integration for dbt ☆411 Updated last year
- Airflow Unit Tests and Integration Tests ☆261 Updated 3 years ago
- Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic. ☆534 Updated 2 weeks ago
- Assets related to the operation of Fishtown Analytics. ☆419 Updated last year
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆168 Updated 2 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆682 Updated 11 months ago
- Airflow basics tutorial ☆397 Updated 4 years ago
- Data ingestion library for Amundsen to build graph and search index ☆204 Updated last year
- The easiest way to run Airflow locally, with linting & tests for valid DAGs and Plugins. ☆258 Updated 4 years ago
- Helm Charts for the Astronomer Platform, Apache Airflow as a Service on Kubernetes ☆488 Updated last week
- Construct Apache Airflow DAGs Declaratively via YAML configuration files ☆1,415 Updated this week
- This repository has moved into https://github.com/dbt-labs/dbt-adapters ☆443 Updated 6 months ago
- Spark style guide ☆271 Updated last year
- A boilerplate for writing PySpark Jobs ☆395 Updated 2 years ago
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables. ☆420 Updated last week
- A series of DAGs/Workflows to help maintain the operation of Airflow ☆1,766 Updated last year
- PySpark test helper methods with beautiful error messages ☆752 Updated 3 weeks ago
- Airflow training for the crunch conf ☆105 Updated 7 years ago
- Airflow Backfill UI based plugin for existing / new Airflow environment ☆64 Updated 5 years ago
- Collection of dbt Tips and Tricks ☆399 Updated 3 years ago
- Redshift package for dbt (getdbt.com) ☆102 Updated last year
- A plugin for Apache Airflow that exposes rest end points for the Command Line Interfaces ☆326 Updated 5 years ago
- Great Expectations Airflow operator ☆170 Updated 2 weeks ago
- Python API for Deequ ☆810 Updated 3 weeks ago
- A curated collection of publicly available resources on dbt best practices and how data-driven organizations around the world utilize dbt ☆115 Updated 3 years ago
- A docker image and kubernetes config files to run Airflow on Kubernetes ☆655 Updated 6 years ago
- ☆22 Updated 5 years ago