GoogleCloudPlatform / spark-on-k8s-gcp-examples
Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub
☆36 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for spark-on-k8s-gcp-examples
- ☆81 · Updated last year
- Cloud Spanner Connector for Apache Spark ☆17 · Updated last week
- Oozie Workflow to Airflow DAGs migration tool ☆87 · Updated 3 weeks ago
- An example Apache Beam project. ☆111 · Updated 7 years ago
- An integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆81 · Updated 4 years ago
- ☆13 · Updated last week
- Ephemeral Hadoop clusters using Google Compute Platform ☆134 · Updated 2 years ago
- Airflow on Kubernetes Operator ☆89 · Updated last year
- Paper: A Zero-rename committer for object stores ☆20 · Updated 3 years ago
- ☆31 · Updated 6 years ago
- Highly configurable Helm Presto Chart ☆24 · Updated 5 years ago
- Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration. ☆70 · Updated last year
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 · Updated 8 months ago
- Docker images for Presto integration testing ☆35 · Updated 5 months ago
- Export Airflow metrics (from mysql) in prometheus format ☆29 · Updated 2 years ago
- In-deprecation. For Lenses please check lensesio/lenses-helm-charts. Soon Stream Reactor will also get its own Helm repository. ☆70 · Updated 4 years ago
- Ansible playbooks for Apache Spark on kube ☆27 · Updated 7 years ago
- ☆54 · Updated 7 years ago
- Spark on Kubernetes infrastructure Docker images repo ☆37 · Updated 2 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆72 · Updated 3 years ago
- ☆63 · Updated 5 years ago
- Spark cloud integration: tests, cloud committers and more ☆19 · Updated 8 months ago
- Scalable CDC Pattern Implemented using PySpark ☆18 · Updated 5 years ago
- Cask Hydrator Plugins Repository ☆67 · Updated 3 weeks ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 · Updated 4 years ago
- Sample code with integration between Data Catalog and Hive data source. ☆25 · Updated 6 months ago
- kafka-connect-s3: Ingest data from Kafka to Object Stores (s3) ☆95 · Updated 5 years ago
- Spark metrics related custom classes and sinks (e.g. Prometheus) ☆176 · Updated 2 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆86 · Updated 8 months ago
- Support Highcharts in Apache Zeppelin ☆81 · Updated 7 years ago