Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub
☆37 · Feb 13, 2018 · Updated 8 years ago
Alternatives and similar repositories for spark-on-k8s-gcp-examples
Users who are interested in spark-on-k8s-gcp-examples are comparing it to the repositories listed below.
- Kubernetes deployment of PrestoDB, Hive Metastore, and Minio S3-standard object store ☆17 · Oct 20, 2022 · Updated 3 years ago
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components ☆10 · Oct 11, 2019 · Updated 6 years ago
- Highly configurable Helm Presto Chart ☆24 · Nov 13, 2019 · Updated 6 years ago
- Java 8's core concepts are explained by examples. ☆12 · Oct 12, 2018 · Updated 7 years ago
- HDFS Automatic Snapshot Service for Linux ☆11 · Oct 17, 2016 · Updated 9 years ago
- Prescriptive Applications over Kite and Hadoop ☆12 · Oct 14, 2015 · Updated 10 years ago
- Content Data Store (HDFS/HBase) ☆13 · Dec 1, 2016 · Updated 9 years ago
- Compose minio + kafka for bucket notifications ☆14 · Jun 16, 2021 · Updated 4 years ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆18 · Jun 28, 2021 · Updated 4 years ago
- Hadoop YARN & MapReduce Memory Calculator ☆13 · Nov 9, 2015 · Updated 10 years ago
- Go Client for Hive Metastore ☆14 · Dec 18, 2022 · Updated 3 years ago
- Repository for batch predict ☆17 · Dec 1, 2021 · Updated 4 years ago
- Docker packaging for Apache Flink ☆139 · Feb 4, 2020 · Updated 6 years ago
- Apache Zeppelin Service for Apache Ambari Service. Installation and management of Zeppelin via Ambari. ☆14 · Jan 23, 2016 · Updated 10 years ago
- Hadoop Data Pipeline using Falcon ☆15 · May 3, 2016 · Updated 9 years ago
- Hive Storage Handler for interoperability between BigQuery and Apache Hive ☆19 · Jan 29, 2025 · Updated last year
- A complete custom processor project, for your reference. ☆17 · Sep 29, 2015 · Updated 10 years ago
- Presto cluster on top of Kubernetes ☆18 · Jul 7, 2019 · Updated 6 years ago
- kafka-manager in Docker container ☆19 · Dec 23, 2020 · Updated 5 years ago
- Uses Cloud Build to deploy a scalable batch ingestion pipeline consisting of GCS, Cloud Functions, Dataflow and BigQuery ☆22 · Dec 7, 2022 · Updated 3 years ago
- Apache Iceberg Spark S3 examples ☆21 · Mar 1, 2024 · Updated 2 years ago
- Examples for how to use the Flink Docker images in a variety of ways ☆91 · Oct 12, 2021 · Updated 4 years ago
- Scripts for k8s scalability testing and analysis ☆23 · Jan 4, 2018 · Updated 8 years ago
- GlusterFS plugin for Hadoop HCFS ☆69 · Apr 12, 2022 · Updated 3 years ago
- Spring Boot Kafka application in a Kubernetes cluster that is released through Helm ☆10 · Feb 6, 2024 · Updated 2 years ago
- Flink image for Kubernetes that fixes JobManager connection issue ☆26 · Jul 31, 2018 · Updated 7 years ago
- Big Data Technology Index ☆25 · Dec 18, 2019 · Updated 6 years ago
- Convert a CSV file to ORCFile ☆26 · Apr 10, 2019 · Updated 6 years ago
- Embed any webapp/website as Ambari view! ☆25 · Feb 26, 2016 · Updated 10 years ago
- Spark Tutorial at the University of Maryland ☆38 · Oct 24, 2014 · Updated 11 years ago
- Anonymizing Library for Apache Spark ☆31 · Nov 9, 2023 · Updated 2 years ago
- Adding certificates to the Docker for Mac beta ☆28 · Nov 30, 2016 · Updated 9 years ago
- Boilerplate project for MOTW Workshop 2015 ☆10 · Mar 3, 2016 · Updated 10 years ago
- Docker images for Presto integration testing ☆35 · Feb 23, 2026 · Updated 2 weeks ago
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables. ☆421 · Mar 1, 2026 · Updated last week
- The plugin of Drone CI to integrate with SonarQube (previously called Sonar), which is an open source code quality management platform. ☆31 · Nov 16, 2022 · Updated 3 years ago
- Kubernetes custom controller and CRDs for managing Airflow ☆297 · Jun 25, 2020 · Updated 5 years ago
- An integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆84 · Mar 16, 2020 · Updated 5 years ago
- Uses Twarc 2 to access Twitter's archive via the API 2.0. Collects, processes and pushes Tweets to a specified Google BigQuery dataset. R… ☆12 · Apr 5, 2023 · Updated 2 years ago