apache-spark-on-k8s / spark
Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
☆612 Updated 5 years ago
Alternatives and similar repositories for spark:
Users who are interested in spark are comparing it to the repositories listed below
- Repository holding configuration files for running an HDFS cluster in Kubernetes ☆396 Updated 6 months ago
- Running YARN on Kubernetes with PetSet controller. ☆166 Updated 7 years ago
- [DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications. ☆658 Updated 2 years ago
- A Kubernetes Scheduler Extender to provide gang scheduling support for Spark on Kubernetes ☆176 Updated last year
- Scripts for generating Grafana dashboards for monitoring Spark jobs ☆242 Updated 10 years ago
- Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes. ☆2,898 Updated this week
- A Kafka Operator for Kubernetes ☆294 Updated 6 years ago
- Docker build for Apache Spark ☆673 Updated 3 years ago
- Livy is an open source REST interface for interacting with Apache Spark from anywhere ☆1,007 Updated 2 years ago
- Kubernetes operator that provides control plane for managing Apache Flink applications ☆570 Updated 7 months ago
- Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ... ☆639 Updated last year
- Mirror of Apache Bahir ☆336 Updated last year
- A tool for monitoring and tuning Spark jobs for efficiency. ☆357 Updated 2 years ago
- Docker image with Ambari ☆290 Updated 7 years ago
- Spark metrics related custom classes and sinks (e.g. Prometheus) ☆180 Updated 2 years ago
- LinkedIn's previous generation Kafka to HDFS pipeline. ☆876 Updated 4 years ago
- Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere. ☆909 Updated this week
- Kubernetes custom controller and CRDs for managing Airflow ☆299 Updated 4 years ago
- Apache Kafka on Apache Mesos ☆412 Updated 6 years ago
- Kite SDK ☆392 Updated 2 years ago
- CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and a… ☆357 Updated this week
- Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark ☆1,354 Updated last year
- Operator for managing the Spark clusters on Kubernetes and OpenShift. ☆158 Updated 3 years ago
- HBase running in Docker ☆331 Updated 2 years ago
- Lightweight proxy to expose the UI of an Apache Spark cluster that is behind a firewall ☆98 Updated 4 years ago
- Tranquility helps you send real-time event streams to Druid and handles partitioning, replication, service discovery, and schema rollover… ☆516 Updated 5 years ago
- Docker packaging for Apache Flink ☆139 Updated 5 years ago
- DC/OS SDK is a collection of tools, libraries, and documentation for easy integration of technologies such as Kafka, Cassandra, HDFS, Spa… ☆156 Updated 6 months ago
- Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in… ☆1,037 Updated 2 years ago
- An Integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆83 Updated 5 years ago