spektom / spark-flamegraph

Easy CPU Profiling for Apache Spark applications

☆48

Alternatives and similar repositories for spark-flamegraph

Users who are interested in spark-flamegraph are comparing it to the repositories listed below.

swoop-inc / spark-records
Bulletproof Apache Spark jobs with fast root cause analysis of failures.
☆73 Updated 4 years ago
swoop-inc / spark-alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
☆185 Updated last month
lightcopy / parquet-index
Spark SQL index for Parquet tables
☆134 Updated 4 years ago
HeartSaVioR / spark-state-tools
Spark Structured Streaming State Tools
☆34 Updated 5 years ago
rdblue / s3committer
Hadoop output committers for S3
☆111 Updated 5 years ago
qubole / rubix
Cache File System optimized for columnar formats and object stores
☆185 Updated 3 years ago
criteo / babar
Profiler for large-scale distributed Java applications (Spark, Scalding, MapReduce, Hive,...) on YARN.
☆128 Updated 7 years ago
chermenin / spark-states
Custom state store providers for Apache Spark
☆92 Updated 9 months ago
nielsbasjes / splittablegzip
Splittable Gzip codec for Hadoop
☆74 Updated last month
cerndb / SparkPlugins
Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…
☆94 Updated 6 months ago
KeithSSmith / spark-compaction
File compaction tool that runs on top of the Spark framework.
☆59 Updated 6 years ago
ExpediaGroup / circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
☆91 Updated last year
qubole / spark-acid
ACID Data Source for Apache Spark based on Hive ACID
☆97 Updated 4 years ago
squito / spark-memory
A tool to get better debug info on Spark's memory usage
☆42 Updated 6 years ago
rymurr / flight-spark-source
☆107 Updated 2 years ago
qubole / quark
Quark is a data virtualization engine over analytic databases.
☆100 Updated 8 years ago
starburstdata / facebook-presto
Starburst Enterprise Distribution of Presto
☆45 Updated 4 years ago
metamx / druid-spark-batch
Druid indexing plugin for using Spark in batch jobs
☆101 Updated 4 years ago
yaooqinn / itachi
A library that brings useful functions from various modern database management systems to Apache Spark
☆60 Updated 2 years ago
ibm-research-ireland / sparkoscope
Enabling Spark Optimization through Cross-stack Monitoring and Visualization
☆47 Updated 8 years ago
SaurabhChawla100 / spark-radiant
Spark-Radiant is an Apache Spark performance and cost optimizer
☆25 Updated 10 months ago
hortonworks-spark / spark-schema-registry
Schema Registry integration for Apache Spark
☆40 Updated 3 years ago
CoxAutomotiveDataSolutions / waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
☆76 Updated last year
hammerlab / yarn-logs-helpers
Scripts for parsing / making sense of YARN logs
☆52 Updated 9 years ago
funkyminds / cleanframes
Type-class-based data cleansing library for Apache Spark SQL
☆78 Updated 6 years ago
apache / datasketches-website
Website for DataSketches.
☆105 Updated last month
FRosner / drunken-data-quality
Spark package for checking data quality
☆222 Updated 5 years ago
FINRAOS / MegaSparkDiff
A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o…
☆52 Updated 5 months ago
hammerlab / spree
Live-updating Spark UI built with Meteor
☆189 Updated 4 years ago
maropu / spark-tpcds-datagen
All the things about TPC-DS in Apache Spark
☆108 Updated 2 years ago