ael-computas / gcp-cloud-composer-pod-operator
Contains example DAGs and Terraform code to create a Cloud Composer environment with a node pool to run pods
☆13 Updated 4 years ago
Alternatives and similar repositories for gcp-cloud-composer-pod-operator:
Users interested in gcp-cloud-composer-pod-operator are comparing it to the repositories listed below:
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated last month
- Repo with scripts and automation to help ensure best practices in Google Data Catalog ☆13 Updated 3 years ago
- ☆47 Updated 10 months ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 Updated 2 years ago
- Composable filesystem hooks and operators for Apache Airflow. ☆17 Updated 3 years ago
- A pytest plugin for dbt adapter test suites ☆19 Updated last year
- A PySpark library to validate data quality ☆18 Updated 2 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Utility functions for dbt projects running on Spark ☆31 Updated last month
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ☆10 Updated last year
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago
- A template dbt project for BigQuery on Google Cloud ☆12 Updated 3 years ago
- Examples for High Performance Spark ☆15 Updated 4 months ago
- A Python package to centralize some Google Cloud Data Catalog scripts; this repo contains commands like bulk CSV operations that help lev… ☆22 Updated 2 years ago
- Tag Engine automates the process of creating, updating, deleting, and populating metadata in bulk with the Google Cloud services Data Cat… ☆53 Updated last week
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆66 Updated 10 months ago
- Guide on how to set up Apache Airflow containers using Docker and IBM Bluemix ☆11 Updated 7 years ago
- Update a Google Data Catalog tag with dbt Cloud run metadata ☆22 Updated 4 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated last week
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- ☆11 Updated 4 months ago
- Pylint plugin for static code analysis on Airflow code ☆93 Updated 4 years ago
- Fake Pandas / PySpark DataFrame creator ☆46 Updated last year
- Sample code with integration between Data Catalog and BI data sources. ☆32 Updated 3 years ago
- Activity Schema dbt package ☆14 Updated last year
- Code that was used as an example during the Data+AI Summit 2020 ☆15 Updated 4 years ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- CLI for data platform ☆19 Updated last year