criteo/babar

Profiler for large-scale distributed Java applications (Spark, Scalding, MapReduce, Hive, ...) on YARN.

☆129

Alternatives and similar repositories for babar

Users who are interested in babar are comparing it to the libraries listed below.

criteo / garmadon
Java event logs collector for hadoop and frameworks
☆42 · Mar 25, 2025 · Updated last year
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
conversant / spark-profiler
☆12 · May 16, 2017 · Updated 9 years ago
apache / kyuubi-client
Client libraries of end users of Apache Kyuubi
☆11 · May 15, 2026 · Updated 2 months ago
criteo / cuttle
An embedded job scheduler.
☆117 · Jul 29, 2024 · Updated last year
Tencent / Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shu…
☆256 · Apr 7, 2023 · Updated 3 years ago
apache / kyuubi-docker
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆16 · May 22, 2026 · Updated 2 months ago
uber-common / jvm-profiler
JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
☆1,804 · May 21, 2026 · Updated 2 months ago
mr-jstraub / ambari-node-view
☆14 · Sep 18, 2016 · Updated 9 years ago
hortonworks-gallery / ambari-freeipa-service
Ambari service for RedHat FreeIPA
☆11 · Sep 30, 2016 · Updated 9 years ago
LucaCanali / sparkMeasure
This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simp…
☆827 · May 19, 2026 · Updated 2 months ago
marcelmay / hfsa
Hadoop FSImage Analyzer (HFSA)
☆68 · Jun 24, 2026 · Updated last month
jiashu-z / how-to-plot
How to plot for papers, slides, demos, etc.
☆10 · Apr 7, 2022 · Updated 4 years ago
CoxAutomotiveDataSolutions / spark-distcp
A re-implementation of Hadoop DistCP in Apache Spark
☆47 · Dec 20, 2023 · Updated 2 years ago
maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 · Sep 8, 2022 · Updated 3 years ago
mitdbg / amoeba
☆17 · Aug 8, 2017 · Updated 8 years ago
cloudera-labs / hive-sre
☆18 · Jan 17, 2025 · Updated last year
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
cerndb / Hadoop-Profiler
Hadoop Profiler, or hprofiler, is a tool which is able to analyze on- and off-CPU workloads on distributed computing environments.
☆24 · Jul 7, 2016 · Updated 10 years ago
JerryLead / SparkProfiler
Profiling Spark Applications for Performance Comparison and Diagnosis
☆16 · Nov 11, 2018 · Updated 7 years ago
steveloughran / cloudstore
Hadoop utility jar for troubleshooting integration with cloud object stores
☆38 · Jun 29, 2026 · Updated 3 weeks ago
criteo / berilia
Create hadoop cluster in aws ec2 for development
☆11 · Sep 8, 2017 · Updated 8 years ago
datafusion-contrib / datafusion-sqlancer
(Archived) End-to-end SQL fuzz testing for DataFusion using SQLancer
☆13 · Apr 16, 2026 · Updated 3 months ago
nevillelyh / parquet-extra
A collection of Apache Parquet add-on modules
☆31 · Updated this week
qubole / sparklens
Qubole Sparklens tool for performance tuning Apache Spark
☆592 · Jun 26, 2024 · Updated 2 years ago
criteo / tf-yarn
Train TensorFlow models on YARN in just a few lines of code!
☆93 · Nov 3, 2023 · Updated 2 years ago
harvard-cns / Harvard-CNS-Seminar
Reading seminar in Harvard Cloud Networking and Systems Group
☆16 · Aug 29, 2022 · Updated 3 years ago
hortonworks-spark / spark-llap
☆102 · Mar 23, 2020 · Updated 6 years ago
tubular / confluent-spark-avro
Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
☆20 · Jan 11, 2018 · Updated 8 years ago
dstreev / hadoop-cli
HADOOP-CLI is an interactive command line shell that makes interacting with the Hadoop Distributed Filesystem (HDFS) simpler and more intu…
☆39 · May 7, 2026 · Updated 2 months ago
hammerlab / grafana-spark-dashboards
Scripts for generating Grafana dashboards for monitoring Spark jobs
☆240 · Mar 26, 2015 · Updated 11 years ago
rymurr / flight-spark-source
☆109 · Jul 5, 2023 · Updated 3 years ago
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
Downfy / log4j-elasticsearch-java-api
Using log4j insert log info into ElasticSearch
☆26 · Oct 31, 2016 · Updated 9 years ago
qubole / rubix
Cache File System optimized for columnar formats and object stores
☆188 · Aug 11, 2022 · Updated 3 years ago
nitsanw / safepoint-experiments
☆11 · Feb 24, 2016 · Updated 10 years ago
ExpediaGroup / datasqueeze
Hadoop utility to compact small files
☆18 · Feb 16, 2026 · Updated 5 months ago
Hydrospheredata / mist
Serverless proxy for Spark cluster
☆325 · Apr 13, 2026 · Updated 3 months ago
ceph / cephfs-hadoop
cephfs-hadoop
☆57 · Dec 10, 2020 · Updated 5 years ago