GoogleCloudDataproc / custom-images
Tools for creating Dataproc custom images
☆32 Updated last week
Alternatives and similar repositories for custom-images:
Users who are interested in custom-images are comparing it to the repositories listed below
- ☆54 Updated 7 years ago
- ☆47 Updated 11 months ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 Updated 7 years ago
- A tool to create Airflow RBAC roles with dag-level permissions from cli. ☆13 Updated last year
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆68 Updated 11 months ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Code Repository for the EVO-ODAS ☆31 Updated 7 years ago
- Cloud Spanner Connector for Apache Spark ☆17 Updated 3 months ago
- hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different. ☆28 Updated 7 years ago
- Apache Beam examples for running on Google Cloud Dataflow. ☆30 Updated 6 years ago
- A pyspark lib to validate data quality ☆18 Updated 2 years ago
- An example PySpark project with pytest ☆16 Updated 7 years ago
- Real-world Spark pipelines examples ☆83 Updated 7 years ago
- Cloud Dataproc: Samples and Utils ☆203 Updated 3 weeks ago
- Spark on Kubernetes using Helm ☆34 Updated 4 years ago
- Oozie Workflow to Airflow DAGs migration tool ☆87 Updated last month
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated 2 months ago
- Pylint plugin for static code analysis on Airflow code ☆93 Updated 4 years ago
- An example Apache Beam project. ☆111 Updated 7 years ago
- Repository with examples and smoke tests for the GCP Airflow operators and hooks ☆147 Updated 8 years ago
- Sample code with integration between Data Catalog and BI data sources. ☆32 Updated 3 years ago
- ☆19 Updated last month
- Fast iterative local development and testing of Apache Airflow workflows ☆200 Updated last week
- This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-dataproc ☆48 Updated last year
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 Updated 4 years ago
- A Getting Started Guide for developing and using Airflow Plugins ☆93 Updated 6 years ago
- ☆37 Updated 5 years ago
- Stream Avro SpecificRecord objects in BigQuery using Cloud Dataflow ☆13 Updated 3 years ago
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆74 Updated 3 years ago
- Airflow workflow management platform chef cookbook. ☆71 Updated 5 years ago