GoogleCloudDataproc / custom-images
Tools for creating Dataproc custom images
☆32 Updated 3 weeks ago
Alternatives and similar repositories for custom-images:
Users who are interested in custom-images are comparing it to the repositories listed below.
- ☆54 Updated 7 years ago
- hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different. ☆28 Updated 7 years ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated last month
- ☆47 Updated 10 months ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Stream Avro SpecificRecord objects in BigQuery using Cloud Dataflow ☆13 Updated 3 years ago
- Oozie Workflow to Airflow DAGs migration tool ☆88 Updated 2 weeks ago
- Quickly get a kubernetes executor airflow environment provisioned on GKE. Azure Kubernetes Service instructions included also as are inst… ☆36 Updated 4 years ago
- ☆19 Updated last week
- Cloud Spanner Connector for Apache Spark ☆17 Updated 2 months ago
- Apache Beam examples for running on Google Cloud Dataflow. ☆30 Updated 6 years ago
- BigQuery Schema Conversion Tool ☆23 Updated 4 years ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 Updated 7 years ago
- Automatically discover and tag PII data across BigQuery tables and apply column-level access controls based on confidentiality level. ☆52 Updated last week
- ☆20 Updated 3 years ago
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆66 Updated 10 months ago
- Airflow on Kubernetes Operator ☆89 Updated 2 years ago
- Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow. ☆142 Updated 9 months ago
- The Internals of Spark on Kubernetes ☆70 Updated 2 years ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- Repository with examples and smoke tests for the GCP Airflow operators and hooks ☆147 Updated 8 years ago
- Cloud Dataproc: Samples and Utils ☆200 Updated 2 months ago
- Magic to help Spark pipelines upgrade ☆34 Updated 5 months ago
- Airflow workflow management platform chef cookbook. ☆71 Updated 5 years ago
- Dataproc templates and pipelines for solving simple in-cloud data tasks ☆124 Updated last week
- Data Catalog Tag Templates ☆30 Updated 5 months ago
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 Updated 4 years ago
- A command-line tool for managing permissions and dependencies for BigQuery authorized views ☆91 Updated 2 years ago
- ☆66 Updated 7 months ago
- Sample code with integration between Data Catalog and BI data sources. ☆32 Updated 3 years ago