kayousterhout / trace-analysis
Scripts to analyze Spark's performance
☆136 · May 20, 2018 · Updated 7 years ago
Alternatives and similar repositories for trace-analysis
Users who are interested in trace-analysis are comparing it to the repositories listed below.
- Benchmark Suite for Apache Spark ☆241 · Apr 12, 2023 · Updated 2 years ago
- Scripts for generating Grafana dashboards for monitoring Spark jobs ☆242 · Mar 26, 2015 · Updated 10 years ago
- Elastic Sentiment Analysis (using Apache Mesos, Marathon and Apache Spark) ☆35 · Mar 16, 2015 · Updated 10 years ago
- TeraSort for Spark and Flink which uses a range partitioner based on sampling ☆22 · Feb 5, 2016 · Updated 10 years ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆476 · Apr 18, 2017 · Updated 8 years ago
- Fluent Scala DSL for Google's Cloud Dataflow SDK ☆56 · Aug 2, 2015 · Updated 10 years ago
- Additional useful algorithms that can be used with Spark. ☆24 · Dec 24, 2014 · Updated 11 years ago
- A tool for running Spark on Google Compute Engine ☆16 · Jan 20, 2017 · Updated 9 years ago
- An efficient updatable key-value store for Apache Spark ☆254 · Mar 11, 2017 · Updated 8 years ago
- Mesos Integration Tests on Docker/EC2 ☆15 · May 25, 2023 · Updated 2 years ago
- Implementation based on OSDI paper ☆20 · Feb 11, 2018 · Updated 8 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 · Oct 27, 2015 · Updated 10 years ago
- Benchmarks of BLAS libraries with Scala interface ☆30 · Jan 21, 2016 · Updated 10 years ago
- GPU Acceleration for Apache Spark ☆34 · Aug 24, 2015 · Updated 10 years ago
- Grappa: scaling irregular applications on commodity clusters ☆159 · May 4, 2017 · Updated 8 years ago
- Automatic offload of user-written Spark kernels to accelerators ☆18 · Oct 25, 2016 · Updated 9 years ago
- MLeap allows for easily putting Spark ML pipelines into production ☆78 · Oct 27, 2016 · Updated 9 years ago
- Enabling queries on compressed data. ☆282 · Dec 16, 2023 · Updated 2 years ago
- Spark Terasort ☆121 · Apr 21, 2023 · Updated 2 years ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark ☆146 · Jan 26, 2016 · Updated 10 years ago
- HiBench is a big data benchmark suite. ☆1,489 · Dec 15, 2025 · Updated last month
- Profile how CUDA applications create and modify data in memory. ☆14 · Mar 22, 2018 · Updated 7 years ago
- An Apache Mesos Framework that allows for replaying load over and over and over (and over) again ☆10 · Aug 10, 2015 · Updated 10 years ago
- Automatically exported from code.google.com/p/cluster-scheduler-simulator ☆171 · Jun 3, 2022 · Updated 3 years ago
- DistML provides a supplement to MLlib to support model parallelism on Spark ☆169 · Feb 6, 2017 · Updated 9 years ago
- Big Spatial Data Processing using Spark ☆147 · Mar 7, 2017 · Updated 8 years ago
- An R-like GLM package for Apache Spark ☆10 · Aug 6, 2015 · Updated 10 years ago
- An AWS SDK-backed FileSystem driver for Hadoop ☆64 · Oct 13, 2020 · Updated 5 years ago
- Sparse feature extraction with Spark ☆30 · Jul 25, 2018 · Updated 7 years ago
- Live-updating Spark UI built with Meteor ☆189 · Apr 6, 2021 · Updated 4 years ago
- Sincronia Implementation ☆11 · Sep 11, 2018 · Updated 7 years ago
- A Scala DSL for dataflow ☆11 · Dec 31, 2014 · Updated 11 years ago
- Parallel ML System - Bosen Java implementation ☆28 · Jan 23, 2017 · Updated 9 years ago
- ☆13 · Jan 16, 2019 · Updated 7 years ago
- Fine-Grained Distributed Computing ☆11 · Feb 15, 2016 · Updated 9 years ago
- Interactive and Reactive Data Science using Scala and Spark. ☆3,151 · May 16, 2023 · Updated 2 years ago
- Flow-level simulator for coflow scheduling used in Varys and Aalo ☆47 · May 23, 2017 · Updated 8 years ago
- https://github.com/apache/incubator-myriad is our new home. See ☆253 · Dec 2, 2015 · Updated 10 years ago
- Sparrow scheduling platform (U.C. Berkeley). ☆328 · Jul 25, 2020 · Updated 5 years ago