airbnb/reair

ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.

☆282

Alternatives and similar repositories for reair

Users who are interested in reair are comparing it to the libraries listed below.

linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
ExpediaGroup / circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
☆93 · Mar 5, 2024 · Updated 2 years ago
apache / gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga…
☆2,269 · Updated this week
airbnb / airpal
Web UI for PrestoDB.
☆2,745 · May 20, 2021 · Updated 5 years ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated last month
apache / eagle
Mirror of Apache Eagle
☆411 · Aug 22, 2020 · Updated 5 years ago
cloudera / impala-udf-samples
Sample UDF and UDAs for Impala.
☆62 · Sep 19, 2025 · Updated 10 months ago
dstreev / hdp-data-gen
Hortonworks Data Platform Data Generation Tool
☆13 · Nov 30, 2017 · Updated 8 years ago
apache / phoenix-tephra
Mirror of Apache Tephra (Incubating)
☆33 · May 15, 2026 · Updated 2 months ago
Netflix / metacat
☆1,688 · Jul 16, 2026 · Updated last week
apache / incubator-hivemall
Mirror of Apache Hivemall (incubating)
☆313 · Sep 6, 2022 · Updated 3 years ago
spark-jobserver / spark-jobserver
REST job server for Apache Spark
☆2,836 · Mar 3, 2026 · Updated 4 months ago
dropbox / PyHive
Python interface to Hive and Presto. 🐝
☆1,696 · Apr 13, 2026 · Updated 3 months ago
apache / incubator-retired-slider
Mirror of Apache Slider
☆79 · Dec 11, 2018 · Updated 7 years ago
yaooqinn / spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has been migrated to Apa…
☆183 · Apr 6, 2022 · Updated 4 years ago
apache / lens
Mirror of Apache Lens
☆63 · Nov 5, 2019 · Updated 6 years ago
apache / trafodion
Apache Trafodion
☆246 · Jun 7, 2021 · Updated 5 years ago
airbnb / SpinalTap
Change Data Capture (CDC) service
☆450 · Jun 24, 2024 · Updated 2 years ago
qubole / spark-acid
ACID Data Source for Apache Spark based on Hive ACID
☆97 · Jul 7, 2021 · Updated 5 years ago
airbnb / omniduct
A toolkit providing a uniform interface for connecting to and extracting data from a wide variety of (potentially remote) data stores (in…
☆257 · Apr 22, 2026 · Updated 3 months ago
hbutani / spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit…
☆281 · Aug 3, 2018 · Updated 7 years ago
cloudera / hs2client
C++ native client for Impala and Hive, with Python / pandas bindings
☆72 · Aug 15, 2018 · Updated 7 years ago
uber-archive / AthenaX
SQL-based streaming analytics platform at scale
☆1,223 · Jun 21, 2020 · Updated 6 years ago
miguel10 / YARN-Memory-Calculator
Hadoop YARN & MapReduce Memory Calculator
☆13 · Nov 9, 2015 · Updated 10 years ago
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Updated this week
qubole / rubix
Cache File System optimized for columnar formats and object stores
☆188 · Aug 11, 2022 · Updated 3 years ago
lightcopy / parquet-index
Spark SQL index for Parquet tables
☆134 · May 6, 2021 · Updated 5 years ago
cloudera-labs / SparkOnHBase
SparkOnHBase
☆278 · Mar 30, 2021 · Updated 5 years ago
srikalyc / Sql4D
Sql interface to druid.
☆78 · Dec 14, 2015 · Updated 10 years ago
apache / phoenix
Apache Phoenix
☆1,059 · Updated this week
hortonworks-spark / shc
The Apache Spark - Apache HBase Connector is a library to support Spark accessing HBase table as external data source or sink.
☆546 · May 10, 2021 · Updated 5 years ago
airbnb / aerosolve
A machine learning package built for humans.
☆4,810 · Nov 6, 2025 · Updated 8 months ago
apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,354 · Updated this week
apache / hudi
Upserts, Deletes And Incremental Processing on Big Data.
☆6,200 · Updated this week
qubole / presto-udfs
Plugin for Presto to allow addition of user functions easily
☆119 · Mar 31, 2021 · Updated 5 years ago
confluentinc / kafka-connect-hdfs
Kafka Connect HDFS connector
☆27 · Updated this week
LinkedInAttic / camus
LinkedIn's previous generation Kafka to HDFS pipeline.
☆879 · Aug 27, 2020 · Updated 5 years ago
sheepkiller / presto-marathon-docker
On demand presto cluster with mesos, marathon and docker.
☆29 · Mar 7, 2018 · Updated 8 years ago
airbnb / knowledge-repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
☆5,538 · Sep 4, 2024 · Updated last year