ael-computas / gcp-cloud-composer-pod-operator
Contains example DAGs and Terraform code to create a Cloud Composer environment with a node pool to run pods
☆13 Updated 4 years ago
Alternatives and similar repositories for gcp-cloud-composer-pod-operator:
Users that are interested in gcp-cloud-composer-pod-operator are comparing it to the libraries listed below
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 Updated last year
- Repo with scripts and automation to help ensure best practices in Google Data Catalog ☆13 Updated 3 years ago
- a pytest plugin for dbt adapter test suites ☆19 Updated last year
- Composable filesystem hooks and operators for Apache Airflow. ☆17 Updated 3 years ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated 3 weeks ago
- ☆24 Updated 4 years ago
- ☆46 Updated 9 months ago
- Examples for High Performance Spark ☆15 Updated 3 months ago
- Amundsen Gremlin ☆21 Updated 2 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Spark functions to run popular phonetic and string matching algorithms ☆60 Updated 2 years ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ☆10 Updated last year
- CLI for data platform ☆19 Updated last year
- Data validation library for PySpark 3.0.0 ☆34 Updated 2 years ago
- Library that aims to generate Kubernetes YAML templates from an Airflow DAG using the Airflow Kubernetes Pod Operator ☆10 Updated 3 years ago
- An open source library for BigQuery testing. ☆14 Updated 2 years ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- A collection of utilities and tools for teams and organizations using dbt ☆13 Updated last year
- A Giter8 template for scio ☆30 Updated 2 weeks ago
- BigQuery test kit is a framework written in Python that allows you to be more confident in your SQL and check that they are ready to prod… ☆52 Updated last year
- A Python package to centralize some Google Cloud Data Catalog scripts, this repo contains commands like bulk CSV operations that help lev… ☆21 Updated 2 years ago
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆79 Updated this week
- Guide on how to set up Apache Airflow containers using Docker and IBM Bluemix ☆11 Updated 7 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 8 years ago
- A PySpark lib to validate data quality ☆18 Updated 2 years ago
- Historical metadata of your data warehouse is a treasure trove to discover not just insights about changing data patterns, but also quali… ☆13 Updated 3 years ago
- 🐋 Docker image for AWS Glue Spark/Python ☆23 Updated last year
- The DAMN (Data Assets Metric Navigation) tool extracts and reports metrics about your data assets ☆12 Updated last month
- JumpSpark - A modern cookiecutter template for PySpark projects with batteries included. ☆10 Updated last year