CODAIT/spark-bench

Benchmark Suite for Apache Spark

☆242

Alternatives and similar repositories for spark-bench

Users that are interested in spark-bench are comparing it to the libraries listed below.

databricks / spark-perf
Performance tests for Apache Spark
☆392 · Jul 9, 2018 · Updated 8 years ago
Intel-bigdata / HiBench
HiBench is a big data benchmark suite.
☆1,484 · Dec 15, 2025 · Updated 7 months ago
databricks / spark-sql-perf
☆623 · Feb 26, 2022 · Updated 4 years ago
kayousterhout / trace-analysis
Scripts to analyze Spark's performance
☆136 · May 20, 2018 · Updated 8 years ago
CODAIT / spark-ref-architecture
Reference Architectures for Apache Spark
☆38 · Jan 23, 2017 · Updated 9 years ago
IBM / spark-tpc-ds-performance-test
Use the TPC-DS benchmark to test Spark SQL performance
☆186 · Apr 27, 2020 · Updated 6 years ago
LucaCanali / sparkMeasure
This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simp…
☆827 · May 19, 2026 · Updated 2 months ago
NetSys / spark-monotasks
Fast, predictable data analytics based on (and API-compatible with) Apache Spark
☆26 · Oct 28, 2017 · Updated 8 years ago
hammerlab / grafana-spark-dashboards
Scripts for generating Grafana dashboards for monitoring Spark jobs
☆240 · Mar 26, 2015 · Updated 11 years ago
JerryLead / SparkProfiler
Profiling Spark Applications for Performance Comparison and Diagnosis
☆16 · Nov 11, 2018 · Updated 7 years ago
ehiggs / spark-terasort
Spark Terasort
☆121 · Apr 21, 2023 · Updated 3 years ago
BBVA / spark-benchmarks
Benchmarking suite for Apache Spark
☆16 · Nov 24, 2017 · Updated 8 years ago
DSC-SPIDAL / harp
A collective communication library plugged into Hadoop
☆23 · Apr 12, 2022 · Updated 4 years ago
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
databricks / benchmarks
A place in which we publish scripts for reproducible benchmarks.
☆105 · Dec 13, 2019 · Updated 6 years ago
yahoo / streaming-benchmarks
Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...
☆647 · Dec 17, 2023 · Updated 2 years ago
rxin / TPC-H-Hive
Running TPC-H on Apache Hive
☆41 · Jul 15, 2019 · Updated 7 years ago
maropu / spark-tpcds-datagen
All the things about TPC-DS in Apache Spark
☆111 · Jun 15, 2023 · Updated 3 years ago
holdenk / spark-testing-base
Base classes to use when writing tests with Spark
☆1,553 · Apr 20, 2026 · Updated 3 months ago
yangqiang / BigDataBench-Spark
BigDataBench Spark workloads
☆11 · Jul 15, 2016 · Updated 10 years ago
japila-books / apache-spark-internals
The Internals of Apache Spark
☆1,547 · Jul 18, 2026 · Updated last week
JonathanMace / tpcds
TPC-DS benchmarks including data generation with Spark and queries with Spark
☆15 · May 8, 2017 · Updated 9 years ago
oxhead / scout
Large-scale performance data of Hadoop and Spark on AWS
☆19 · May 24, 2018 · Updated 8 years ago
amplab / keystone
Simplifying robust end-to-end machine learning on Apache Spark.
☆473 · Apr 18, 2017 · Updated 9 years ago
jackkolokasis / teraheap
TeraHeap: Reducing Memory Pressure in Managed Big Data Frameworks
☆28 · Jul 17, 2025 · Updated last year
amplab / benchmark
Large scale query engine benchmark
☆99 · Apr 5, 2016 · Updated 10 years ago
ibm-watson-data-lab / spark.samples
Tutorials and samples that show you how to get the most out of IBM Analytics for Apache Spark
☆78 · Mar 16, 2018 · Updated 8 years ago
JerryLead / SparkInternals
Notes talking about the design and implementation of Apache Spark
☆5,361 · Apr 2, 2024 · Updated 2 years ago
odpi / specs
ODPi specifications, developed by ODPi Runtime and ODPi Operations projects. Currently in Emeritus status
☆35 · Feb 12, 2019 · Updated 7 years ago
ReactiveDesignPatterns / website
website source for Reactive Design Patterns
☆12 · Oct 16, 2022 · Updated 3 years ago
ssavvides / tpch-spark
TPC-H queries in Apache Spark SQL using native DataFrames API
☆99 · Jan 24, 2024 · Updated 2 years ago
zrlio / crail-spark-io
Fast I/O plugins for Spark
☆42 · Dec 14, 2020 · Updated 5 years ago
CODAIT / stocator
Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
☆115 · May 17, 2024 · Updated 2 years ago
aws-samples / emr-spark-benchmark
☆26 · Apr 26, 2026 · Updated 3 months ago
squito / spark-memory
A tool to get better debug info on Spark's memory usage
☆42 · Aug 21, 2019 · Updated 6 years ago
databricks / reference-apps
Spark reference applications
☆649 · Oct 3, 2024 · Updated last year
Huawei-Spark / Spark-SQL-on-HBase
Native, optimized access to HBase Data through Spark SQL/Dataframe Interfaces
☆316 · Apr 12, 2022 · Updated 4 years ago
databricks / simr
Spark In MapReduce (SIMR) - launching Spark applications on existing Hadoop MapReduce infrastructure
☆44 · Mar 9, 2022 · Updated 4 years ago
lihaoyi / scala-bench
Some benchmarks of memory and runtime performance of Scala's collections
☆44 · May 19, 2024 · Updated 2 years ago