Scripts to analyze Spark's performance
☆136 May 20, 2018 Updated 7 years ago
Alternatives and similar repositories for trace-analysis
Users who are interested in trace-analysis are comparing it to the libraries listed below
- Information-Agnostic Flow Scheduling for Commodity Data Centers ☆16 Jul 20, 2016 Updated 9 years ago
- Benchmark Suite for Apache Spark ☆240 Apr 12, 2023 Updated 2 years ago
- Scripts for generating Grafana dashboards for monitoring Spark jobs ☆242 Mar 26, 2015 Updated 10 years ago
- Elastic Sentiment Analysis (using Apache Mesos, Marathon and Apache Spark) ☆35 Mar 16, 2015 Updated 10 years ago
- TeraSort for Spark and Flink which uses a range partitioner based on sampling ☆22 Feb 5, 2016 Updated 10 years ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆475 Apr 18, 2017 Updated 8 years ago
- Aalo: Efficient Non-Clairvoyant Coflow Scheduler ☆13 Nov 22, 2015 Updated 10 years ago
- Fluent Scala DSL for Google's Cloud Dataflow SDK ☆56 Aug 2, 2015 Updated 10 years ago
- Low level integration of Spark and Kafka ☆130 Mar 15, 2018 Updated 7 years ago
- Additional useful algorithms that can be used with spark. ☆24 Dec 24, 2014 Updated 11 years ago
- A tool for running Spark on Google Compute Engine ☆16 Jan 20, 2017 Updated 9 years ago
- An efficient updatable key-value store for Apache Spark ☆254 Mar 11, 2017 Updated 8 years ago
- Mesos Integration Tests on Docker/Ec2 ☆15 May 25, 2023 Updated 2 years ago
- Implementation based on OSDI paper ☆20 Feb 11, 2018 Updated 8 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 Oct 27, 2015 Updated 10 years ago
- Benchmarks of BLAS libraries with Scala interface ☆30 Jan 21, 2016 Updated 10 years ago
- GPU Acceleration for Apache Spark ☆34 Aug 24, 2015 Updated 10 years ago
- Automatic offload of user-written Spark kernels to accelerators ☆18 Oct 25, 2016 Updated 9 years ago
- MLeap allows for easily putting Spark ML pipelines into production ☆78 Oct 27, 2016 Updated 9 years ago
- Enabling queries on compressed data. ☆282 Dec 16, 2023 Updated 2 years ago
- Spark Terasort ☆121 Apr 21, 2023 Updated 2 years ago
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark ☆146 Jan 26, 2016 Updated 10 years ago
- HiBench is a big data benchmark suite. ☆1,489 Dec 15, 2025 Updated 2 months ago
- An Apache Mesos Framework that allows for replaying load over and over and over (and over) again ☆10 Aug 10, 2015 Updated 10 years ago
- Profile how CUDA applications create and modify data in memory. ☆14 Mar 22, 2018 Updated 7 years ago
- Automatically exported from code.google.com/p/cluster-scheduler-simulator ☆171 Jun 3, 2022 Updated 3 years ago
- Big Spatial Data Processing using Spark ☆146 Mar 7, 2017 Updated 9 years ago
- An R-like GLM package for Apache Spark ☆10 Aug 6, 2015 Updated 10 years ago
- An AWS SDK-backed FileSystem driver for Hadoop ☆64 Oct 13, 2020 Updated 5 years ago
- Sparse feature extraction with Spark ☆30 Jul 25, 2018 Updated 7 years ago
- Live-updating Spark UI built with Meteor ☆190 Apr 6, 2021 Updated 4 years ago
- A scala dsl for dataflow ☆11 Dec 31, 2014 Updated 11 years ago
- Project ARES represents a joint effort between LANL and ORNL to introduce a common compiler representation and tool-chain for HPC applica… ☆10 Nov 30, 2016 Updated 9 years ago
- ☆13 Jan 16, 2019 Updated 7 years ago
- Fine-Grained Distributed Computing ☆11 Feb 15, 2016 Updated 10 years ago
- Interactive and Reactive Data Science using Scala and Spark. ☆3,150 May 16, 2023 Updated 2 years ago
- https://github.com/apache/incubator-myriad is our new home. See ☆252 Dec 2, 2015 Updated 10 years ago
- Sparrow scheduling platform (U.C. Berkeley). ☆329 Jul 25, 2020 Updated 5 years ago
- Statistical Workload Injector for MapReduce - Project at UC Berkeley AMP Lab ☆129 May 29, 2014 Updated 11 years ago