ExpediaGroup/waggle-dance

Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.

☆288

Alternatives and similar repositories for waggle-dance

Users that are interested in waggle-dance are comparing it to the libraries listed below.

ExpediaGroup / circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
☆93 · Mar 5, 2024 · Updated 2 years ago
ExpediaGroup / shunting-yard
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 · Oct 11, 2021 · Updated 4 years ago
ExpediaGroup / beekeeper
Service for automatically managing and cleaning up unreferenced data
☆50 · Apr 24, 2026 · Updated 2 months ago
ExpediaGroup / drone-fly
A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service
☆13 · Jun 30, 2026 · Updated 3 weeks ago
ExpediaGroup / beeju
JUnit integration for testing the Apache Hive Metastore and HiveServer2 Thrift APIs
☆26 · Jul 22, 2025 · Updated last year
ExpediaGroup / apiary
Apiary provides modules which can be combined to create a federated cloud data lake
☆38 · Apr 3, 2024 · Updated 2 years ago
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
ExpediaGroup / datasqueeze
Hadoop utility to compact small files
☆18 · Feb 16, 2026 · Updated 5 months ago
apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,353 · Updated this week
HiveRunner / mutant-swarm
Mutation testing framework and code coverage for Hive SQL
☆24 · May 11, 2021 · Updated 5 years ago
lyft / presto-gateway
A load balancer / proxy / gateway for prestodb
☆359 · Jul 25, 2024 · Updated last year
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
apache / uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
☆451 · Updated this week
apache / celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
☆1,056 · Updated this week
apache / amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
☆1,151 · Updated this week
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
hortonworks / hive-testbench
☆392 · Jan 25, 2024 · Updated 2 years ago
insightlake / Ranger-Metastore-Plugin
Ranger Hive Metastore Plugin
☆18 · Jul 21, 2023 · Updated 3 years ago
cubefs / compass
Compass is a task diagnosis platform for big data
☆405 · Nov 23, 2024 · Updated last year
CoxAutomotiveDataSolutions / spark-distcp
A re-implementation of Hadoop DistCP in Apache Spark
☆47 · Dec 20, 2023 · Updated 2 years ago
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
ExpediaGroup / corc
An ORC File Scheme for the Cascading data processing platform.
☆14 · Aug 26, 2021 · Updated 4 years ago
qubole / spark-acid
ACID Data Source for Apache Spark based on Hive ACID
☆97 · Jul 7, 2021 · Updated 5 years ago
NetEase / spark-ranger
ACL Management for Apache Spark SQL with Apache Ranger
☆17 · Jun 18, 2020 · Updated 6 years ago
Intel-bigdata / SSM
Smart Storage Management for Big Data, a comprehensive hot/cold data optimized solution
☆139 · Jan 3, 2023 · Updated 3 years ago
ExpediaGroup / plunger
A unit testing framework for the Cascading data processing platform.
☆25 · Aug 25, 2021 · Updated 4 years ago
paypal / NNAnalytics
NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
☆121 · Nov 25, 2025 · Updated 7 months ago
varadaio / presto-workload-analyzer
The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them
☆136 · Oct 25, 2023 · Updated 2 years ago
aistack / sql-booster
This is a library for SQL optimizing/rewriting including Materialized View rewrite
☆70 · Jun 21, 2022 · Updated 4 years ago
oap-project / sql-ds-cache
Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.
☆37 · Jan 3, 2023 · Updated 3 years ago
uber / RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
☆335 · Sep 29, 2023 · Updated 2 years ago
AbsaOSS / spline-spark-agent
Spline agent for Apache Spark
☆207 · Updated this week
yaooqinn / spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has been migrated to Apa…
☆183 · Apr 6, 2022 · Updated 4 years ago
Netflix / metacat
☆1,687 · Updated this week
airbnb / reair
ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.
☆282 · Feb 27, 2019 · Updated 7 years ago
awesome-kyuubi / hadoop-testing
Testing Sandbox for Hadoop Ecosystem Components
☆45 · Jun 16, 2026 · Updated last month
mr3project / hive-mr3
Hive for MR3
☆39 · Updated this week
hortonworks-spark / spark-atlas-connector
A Spark Atlas connector to track data lineage in Apache Atlas
☆268 · Nov 16, 2022 · Updated 3 years ago
apache / ranger
Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
☆1,065 · Updated this week