ehiggs / spark-terasort
Spark Terasort
☆121 · Apr 21, 2023 · Updated 2 years ago
Alternatives and similar repositories for spark-terasort
Users who are interested in spark-terasort are comparing it to the libraries listed below.
- Mirror of Apache Spark · ☆10 · Dec 23, 2022 · Updated 3 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange · ☆130 · Dec 19, 2024 · Updated last year
- This is archive of SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid… · ☆258 · May 13, 2019 · Updated 6 years ago
- Use the TPC-DS benchmark to test Spark SQL performance · ☆184 · Apr 27, 2020 · Updated 5 years ago
- TeraSort for Spark and Flink which uses a range partitioner based on sampling · ☆22 · Feb 5, 2016 · Updated 10 years ago
- All the things about TPC-DS in Apache Spark · ☆110 · Jun 15, 2023 · Updated 2 years ago
- Additional useful algorithms that can be used with spark. · ☆24 · Dec 24, 2014 · Updated 11 years ago
- Apache DataFusion Benchmarks · ☆24 · Dec 31, 2025 · Updated last month
- Apache Zeppelin Service for Apache Ambari Service. Installation and management of Zeppelin via Ambari. · ☆14 · Jan 23, 2016 · Updated 10 years ago
- Profiling Spark Applications for Performance Comparison and Diagnosis · ☆17 · Nov 11, 2018 · Updated 7 years ago
- HiBench is a big data benchmark suite. · ☆1,489 · Dec 15, 2025 · Updated 2 months ago
- MapReduce performance testing using teragen and terasort · ☆18 · Aug 26, 2021 · Updated 4 years ago
- Benchmark Suite for Apache Spark · ☆241 · Apr 12, 2023 · Updated 2 years ago
- Ansible playbook for automated HDP 2.x deployment install with Kerberos · ☆19 · Sep 8, 2016 · Updated 9 years ago
- Fast I/O plugins for Spark · ☆41 · Dec 14, 2020 · Updated 5 years ago
- [DEPRECATED] For read-only reference of the ALOJA Big Data Benchmarking platform: includes tools to define and deploy clusters, orchestr… · ☆23 · Feb 17, 2021 · Updated 4 years ago
- Spark cloud integration: tests, cloud committers and more · ☆20 · Jan 30, 2025 · Updated last year
- ☆28 · Jun 17, 2025 · Updated 7 months ago
- A S3 Shuffle plugin for Apache Spark to enable elastic scaling for generic Spark workloads. · ☆52 · Sep 17, 2025 · Updated 4 months ago
- Jasmine "lnishan" Chen's Curriculum Vitae (CV) in Markdown · ☆10 · May 23, 2018 · Updated 7 years ago
- Client libraries of end users of Apache Kyuubi · ☆11 · Jan 10, 2023 · Updated 3 years ago
- k8s.gcr.io/echoserver fork · ☆13 · Sep 7, 2021 · Updated 4 years ago
- Various tools to help plan HDP and CDH upgrades to CDP · ☆14 · Dec 7, 2021 · Updated 4 years ago
- A Study of Database Performance Sensitivity to Experiment Settings · ☆10 · May 31, 2022 · Updated 3 years ago
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components · ☆10 · Oct 11, 2019 · Updated 6 years ago
- Alerting and monitoring tool for Apache Spark · ☆23 · May 20, 2022 · Updated 3 years ago
- Spark examples · ☆41 · May 7, 2024 · Updated last year
- ☆21 · Apr 17, 2023 · Updated 2 years ago
- A split-out module for viewing the syntax tree generated by Spark SQL · ☆91 · May 26, 2019 · Updated 6 years ago
- Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark · ☆1,371 · Aug 22, 2023 · Updated 2 years ago
- An R-like GLM package for Apache Spark · ☆10 · Aug 6, 2015 · Updated 10 years ago
- Content Data Store (HDFS/HBase) · ☆13 · Dec 1, 2016 · Updated 9 years ago
- Helm Chart for lyft/flinkk8soperator · ☆11 · Mar 10, 2020 · Updated 5 years ago
- ☆13 · Mar 29, 2019 · Updated 6 years ago
- Barman module for Puppet · ☆14 · Oct 31, 2022 · Updated 3 years ago
- Superseded by FabSim3. · ☆14 · Jun 11, 2019 · Updated 6 years ago
- A disaggregated memory orchestration system that virtualizes cluster wide memory to scale data intensive, large memory workloads in virtu… · ☆13 · Apr 26, 2019 · Updated 6 years ago
- Prescriptive Applications over Kite and Hadoop · ☆12 · Oct 14, 2015 · Updated 10 years ago
- Manages Git identities, including SSH keys · ☆11 · Aug 24, 2023 · Updated 2 years ago