Spark Terasort
☆121Apr 21, 2023Updated 2 years ago
Alternatives and similar repositories for spark-terasort
Users that are interested in spark-terasort are comparing it to the libraries listed below
- Mirror of Apache Spark☆10Dec 23, 2022Updated 3 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange☆130Dec 19, 2024Updated last year
- This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid…☆257May 13, 2019Updated 6 years ago
- Use the TPC-DS benchmark to test Spark SQL performance☆184Apr 27, 2020Updated 5 years ago
- TeraSort for Spark and Flink which uses a range partitioner based on sampling☆22Feb 5, 2016Updated 10 years ago
- All the things about TPC-DS in Apache Spark☆109Jun 15, 2023Updated 2 years ago
- Apache Zeppelin Service for Apache Ambari Service. Installation and management of Zeppelin via Ambari.☆14Jan 23, 2016Updated 10 years ago
- Profiling Spark Applications for Performance Comparison and Diagnosis☆17Nov 11, 2018Updated 7 years ago
- HiBench is a big data benchmark suite.☆1,489Dec 15, 2025Updated 2 months ago
- MapReduce performance testing using teragen and terasort☆18Aug 26, 2021Updated 4 years ago
- Benchmark Suite for Apache Spark☆240Apr 12, 2023Updated 2 years ago
- Ansible playbook for automated HDP 2.x deployment install with Kerberos☆19Sep 8, 2016Updated 9 years ago
- Fast I/O plugins for Spark☆41Dec 14, 2020Updated 5 years ago
- [DEPRECATED] For read-only reference of the ALOJA Big Data Benchmarking platform: includes tools to define and deploy clusters, orchestr…☆23Feb 17, 2021Updated 5 years ago
- Spark cloud integration: tests, cloud committers and more☆20Jan 30, 2025Updated last year
- Client libraries of end users of Apache Kyuubi☆11Jan 10, 2023Updated 3 years ago
- Example project to show how to use Kafka from Spark Streaming with the Confluent schema registry☆11Aug 17, 2016Updated 9 years ago
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components☆10Oct 11, 2019Updated 6 years ago
- Various tools to help plan HDP and CDH upgrades to CDP☆14Dec 7, 2021Updated 4 years ago
- An S3 Shuffle plugin for Apache Spark to enable elastic scaling for generic Spark workloads.☆52Sep 17, 2025Updated 5 months ago
- Alerting and monitoring tool for Apache Spark☆23May 20, 2022Updated 3 years ago
- Spark examples☆41May 7, 2024Updated last year
- ☆21Apr 17, 2023Updated 2 years ago
- An extracted module for viewing the syntax tree generated by Spark SQL☆91May 26, 2019Updated 6 years ago
- ☆23May 12, 2018Updated 7 years ago
- Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark☆1,371Aug 22, 2023Updated 2 years ago
- Prescriptive Applications over Kite and Hadoop☆12Oct 14, 2015Updated 10 years ago
- Helm Chart for lyft/flinkk8soperator☆11Mar 10, 2020Updated 5 years ago
- HDFS Automatic Snapshot Service for Linux☆11Oct 17, 2016Updated 9 years ago
- Content Data Store (HDFS/HBase)☆13Dec 1, 2016Updated 9 years ago
- Barman module for Puppet☆14Oct 31, 2022Updated 3 years ago
- An R-like GLM package for Apache Spark☆10Aug 6, 2015Updated 10 years ago
- Java 8's core concepts are explained by examples.☆12Oct 12, 2018Updated 7 years ago
- A disaggregated memory orchestration system that virtualizes cluster wide memory to scale data intensive, large memory workloads in virtu…☆13Apr 26, 2019Updated 6 years ago
- Manages Git identities, including SSH keys☆11Aug 24, 2023Updated 2 years ago
- Superseded by FabSim3.☆14Jun 11, 2019Updated 6 years ago
- ESPBench - The Enterprise Stream Processing Benchmark☆15Dec 27, 2023Updated 2 years ago
- Spark Tutorial at the University of Maryland☆38Oct 24, 2014Updated 11 years ago
- Scripts to analyze Spark's performance☆136May 20, 2018Updated 7 years ago