The Internals of Spark on Kubernetes
☆73May 9, 2022Updated 3 years ago
Alternatives and similar repositories for spark-kubernetes-book
Users who are interested in spark-kubernetes-book are comparing it to the libraries listed below
- The Internals of Delta Lake☆188Nov 30, 2025Updated 3 months ago
- The Internals of Spark SQL☆486Jan 25, 2026Updated last month
- Local Development of AWS Glue with Docker and Visual Studio Code☆14Nov 29, 2021Updated 4 years ago
- Infra stuff to run Kubernetes on travisci☆10Mar 7, 2023Updated 3 years ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the EMR Serverless job, you…☆11Nov 18, 2025Updated 3 months ago
- Provide functionality to build statistical models to repair dirty tabular data in Spark☆12Apr 21, 2023Updated 2 years ago
- ☆18Nov 4, 2024Updated last year
- Best practices and recommendations for getting started with Amazon EMR on EKS.☆68Jan 27, 2026Updated last month
- Scrapy exporter for Big Data formats☆16Feb 27, 2026Updated last week
- ☆17Feb 16, 2020Updated 6 years ago
- Trino connectors for accessing APIs with an OpenAPI spec☆43Feb 9, 2026Updated last month
- Docker image for Spark history server on Kubernetes☆15Mar 13, 2020Updated 5 years ago
- Ranger Hive Metastore Plugin☆18Jul 21, 2023Updated 2 years ago
- Testing Sandbox for Hadoop Ecosystem Components☆44Feb 25, 2026Updated last week
- Spark on Kubernetes infrastructure Docker images repo☆37Oct 20, 2022Updated 3 years ago
- Examples for High Performance Spark☆16Oct 25, 2025Updated 4 months ago
- An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC☆41Oct 1, 2024Updated last year
- Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.☆3,109Updated this week
- Apache Ranger Plugin for S3☆20Nov 30, 2022Updated 3 years ago
- Rocksdb state storage implementation for Structured Streaming.☆17Oct 21, 2020Updated 5 years ago
- SBT project showing shading a library with SBT assembly☆15Oct 4, 2018Updated 7 years ago
- The simplest production deployment solution for Spark SQL on Kubernetes☆19Jun 12, 2023Updated 2 years ago
- A K8s-based infrastructure for analytics☆24Jan 15, 2020Updated 6 years ago
- Run AllegroGraph in a Docker container☆21May 31, 2025Updated 9 months ago
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…☆94May 9, 2025Updated 10 months ago
- A tool to validate data, built around Apache Spark.☆101Feb 19, 2026Updated 2 weeks ago
- The Internals of Apache Spark☆1,541Jul 5, 2025Updated 8 months ago
- Spark and Delta Lake Workshop☆22Jun 14, 2022Updated 3 years ago
- A demo project that analyzes the most popular Twitter hashtags using Java 8, Spring Boot, Spark Streaming, Kafka, and Docker.☆22Feb 16, 2026Updated 3 weeks ago
- Spark SQL listener to record lineage information☆28Jan 24, 2021Updated 5 years ago
- ☆25Mar 15, 2024Updated last year
- Serializable ACID transactions on streaming data☆25Oct 21, 2022Updated 3 years ago
- Operator for managing the Spark clusters on Kubernetes and OpenShift.☆158Nov 18, 2021Updated 4 years ago
- Open source stack lakehouse☆25Mar 2, 2024Updated 2 years ago
- Data Engineering with Scala, published by Packt☆28Mar 2, 2026Updated last week
- The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this r…☆62Jun 15, 2023Updated 2 years ago
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.☆431Jan 14, 2022Updated 4 years ago
- The mm-ADT Virtual Machine☆35Nov 22, 2020Updated 5 years ago
- The Internals of Spark Structured Streaming☆422Mar 2, 2026Updated last week