GoogleCloudPlatform / spark-on-k8s-gcp-examples
Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub
☆37 Updated 7 years ago
Alternatives and similar repositories for spark-on-k8s-gcp-examples:
Users interested in spark-on-k8s-gcp-examples also compare it with the repositories listed below.
- An example Apache Beam project. ☆111 Updated 7 years ago
- ☆81 Updated last year
- Cloud Spanner Connector for Apache Spark ☆17 Updated last month
- Oozie Workflow to Airflow DAGs migration tool ☆88 Updated last month
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 Updated 11 months ago
- Ephemeral Hadoop clusters using Google Compute Platform ☆135 Updated 2 years ago
- An Integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆82 Updated 4 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 4 years ago
- Airflow on Kubernetes Operator ☆89 Updated 2 years ago
- A Spark metrics sink that pushes to InfluxDb ☆51 Updated 4 years ago
- Export Airflow metrics (from mysql) in prometheus format ☆29 Updated 2 weeks ago
- Docker Image and Kubernetes Configurations for Spark 2.x ☆41 Updated 5 years ago
- ☆13 Updated last week
- Sample code with integration between Data Catalog and Hive data source. ☆25 Updated 3 weeks ago
- Spark on Kubernetes infrastructure Docker images repo ☆37 Updated 2 years ago
- Highly configurable Helm Presto Chart ☆24 Updated 5 years ago
- Cloud Dataproc: Samples and Utils ☆200 Updated last month
- ☆54 Updated 7 years ago
- The Internals of Spark on Kubernetes ☆70 Updated 2 years ago
- File compaction tool that runs on top of the Spark framework. ☆59 Updated 5 years ago
- type-class based data cleansing library for Apache Spark SQL ☆79 Updated 5 years ago
- Spark metrics related custom classes and sinks (e.g. Prometheus) ☆177 Updated 2 years ago
- Get started with Apache Beam and Flink ☆42 Updated 8 years ago
- AMQP data source for dstream (Spark Streaming) ☆26 Updated 2 years ago
- Lightweight proxy to expose the UI of an Apache Spark cluster that is behind a firewall ☆98 Updated 4 years ago
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP ☆90 Updated 6 months ago
- Examples of Spark 3.0 ☆46 Updated 4 years ago
- Ansible playbooks for Apache Spark on kube ☆27 Updated 7 years ago
- Docker image for Spark history server on Kubernetes ☆15 Updated 4 years ago
- Setup for running Trino with Hive Metastore on Kubernetes ☆99 Updated 2 years ago