GoogleCloudDataproc / custom-images
Tools for creating Dataproc custom images
☆33 · Updated last week
Alternatives and similar repositories for custom-images
Users who are interested in custom-images are comparing it to the repositories listed below.
- Oozie Workflow to Airflow DAGs migration tool ☆87 · Updated 2 months ago
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆69 · Updated last year
- ☆54 · Updated 7 years ago
- ☆47 · Updated last year
- Data validation library for PySpark 3.0.0 ☆33 · Updated 2 years ago
- A tool to create Airflow RBAC roles with dag-level permissions from cli. ☆13 · Updated last year
- Cloud Spanner Connector for Apache Spark ☆17 · Updated 4 months ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 · Updated 3 months ago
- Stream Avro SpecificRecord objects in BigQuery using Cloud Dataflow ☆13 · Updated 3 years ago
- A Giter8 template for scio ☆31 · Updated 3 months ago
- An example PySpark project with pytest ☆16 · Updated 7 years ago
- The Internals of Spark on Kubernetes ☆71 · Updated 3 years ago
- Pylint plugin for static code analysis on Airflow code ☆94 · Updated 4 years ago
- Cloud Dataproc: Samples and Utils ☆203 · Updated last month
- Spark pipelines that correspond to a series of Dataflow examples. ☆27 · Updated 6 years ago
- Examples of Spark 3.0 ☆47 · Updated 4 years ago
- ☕⛵ WIP PySpark dependency management ☆22 · Updated 6 years ago
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆92 · Updated 9 months ago
- JSON schema parser for Apache Spark ☆81 · Updated 2 years ago
- Examples using Google Cloud Dataflow - Apache Beam ☆35 · Updated 2 years ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 · Updated 7 years ago
- Examples for High Performance Spark ☆15 · Updated 6 months ago
- Spark on Kubernetes using Helm ☆34 · Updated 4 years ago
- An example Apache Beam project. ☆111 · Updated 7 years ago
- Fast iterative local development and testing of Apache Airflow workflows ☆200 · Updated 2 weeks ago
- hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different. ☆28 · Updated 7 years ago
- Dataproc templates and pipelines for solving in-cloud data tasks ☆128 · Updated last month
- ☆128 · Updated last year
- ☆19 · Updated 2 months ago
- Magic to help Spark pipelines upgrade ☆35 · Updated 7 months ago