ExpediaGroup/circus-train

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.

☆93

Alternatives and similar repositories for circus-train

Users who are interested in circus-train are comparing it to the libraries listed below.

ExpediaGroup / shunting-yard
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 · Oct 11, 2021 · Updated 4 years ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated 3 weeks ago
ExpediaGroup / apiary
Apiary provides modules which can be combined to create a federated cloud data lake
☆38 · Apr 3, 2024 · Updated 2 years ago
ExpediaGroup / datasqueeze
Hadoop utility to compact small files
☆18 · Feb 16, 2026 · Updated 5 months ago
ExpediaGroup / beeju
JUnit integration for testing the Apache Hive Metastore and HiveServer2 Thrift APIs
☆26 · Jul 22, 2025 · Updated 11 months ago
ExpediaGroup / corc
An ORC File Scheme for the Cascading data processing platform.
☆14 · Aug 26, 2021 · Updated 4 years ago
steveloughran / cloudstore
Hadoop utility jar for troubleshooting integration with cloud object stores
☆38 · Jun 29, 2026 · Updated 3 weeks ago
airbnb / reair
ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.
☆282 · Feb 27, 2019 · Updated 7 years ago
ExpediaGroup / stream-registry
Stream Discovery and Stream Orchestration
☆124 · Jan 7, 2026 · Updated 6 months ago
ExpediaGroup / jasvorno
A library for strong, schema based conversion between 'natural' JSON documents and Avro
☆18 · Mar 5, 2024 · Updated 2 years ago
aaronanderson / aws-java-sdk-v2-utils
Pure JAX-RS 2.0 ClientRequestFilter/WriterInterceptor used to sign AWS REST requests. Also has presign capabilities.
☆15 · Jan 4, 2022 · Updated 4 years ago
hammerlab / spark-util
low-level helpers for Apache Spark libraries and tests
☆16 · Dec 29, 2018 · Updated 7 years ago
ExpediaGroup / insights-explorer
Insights Explorer is a tool to catalogue and present analytical & research work.
☆14 · Nov 26, 2024 · Updated last year
HubSpot / hbase-support
Supporting configs and tools for HBase at HubSpot
☆17 · May 9, 2014 · Updated 12 years ago
bolcom / hive_compared_bq
hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.
☆27 · Dec 13, 2017 · Updated 8 years ago
ExpediaGroup / apiary-data-lake
Terraform scripts for deploying Apiary Data Lake
☆19 · Apr 16, 2026 · Updated 3 months ago
VividCortex / lastseen
Last-seen sketch implementation in Go
☆16 · Dec 15, 2020 · Updated 5 years ago
ExpediaGroup / drone-fly
A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service
☆13 · Jun 30, 2026 · Updated 3 weeks ago
qubole / rubix
Cache File System optimized for columnar formats and object stores
☆188 · Aug 11, 2022 · Updated 3 years ago
sungchun12 / serverless-data-pipeline-gcp
Schedule a data pipeline in Google Cloud using cloud function, BigQuery, cloud storage, cloud scheduler, stack trace, cloud build, and p…
☆25 · Jun 4, 2019 · Updated 7 years ago
mvanderlee / aiotrino
☆21 · Mar 21, 2025 · Updated last year
HiveRunner / HiveRunner
An Open Source unit test framework for Hive queries based on JUnit 4 and 5
☆262 · Jan 6, 2025 · Updated last year
ExpediaGroup / hiveberg
Demonstration of a Hive Input Format for Iceberg
☆26 · Mar 12, 2021 · Updated 5 years ago
qubole / spark-acid
ACID Data Source for Apache Spark based on Hive ACID
☆97 · Jul 7, 2021 · Updated 5 years ago
uber-node / ringpop-common
Cross-language compatible tools and documentation for Ringpop
☆24 · Jun 20, 2017 · Updated 9 years ago
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
NitinSPatil15 / Project-3-Data-Warehouse-with-AWS
An ETL pipeline that extracts data from S3, stages them in Redshift, and transforms data into a set of dimensional tables
☆16 · May 5, 2020 · Updated 6 years ago
apache / datasketches-memory
High performance native memory access for Java.
☆134 · Jul 13, 2026 · Updated last week
lyft / presto-gateway
A load balancer / proxy / gateway for prestodb
☆359 · Jul 25, 2024 · Updated last year
ground-context / ground
An open-source, vendor-neutral data context service.
☆163 · Mar 6, 2018 · Updated 8 years ago
qubole / streamx
kafka-connect-s3: Ingest data from Kafka to Object Stores (s3)
☆96 · Apr 4, 2019 · Updated 7 years ago
microsoft / vscode-jupyter-hub
Jupyter Hub Support in VS Code
☆17 · Jul 13, 2026 · Updated last week
ExpediaGroup / flyte
Flyte binds together the tools you use into easily defined, automated workflows
☆88 · Mar 5, 2024 · Updated 2 years ago
51zero / eel-sdk
Big Data Toolkit for the JVM
☆147 · Nov 4, 2020 · Updated 5 years ago
ricklon / USB-Arduino-Developer-Device
Use an Arduino with USB HID support to control a project in Git
☆13 · Jan 3, 2012 · Updated 14 years ago
BrynMawrCollege / jupyterhub
Code for our local jupyterhub install
☆18 · Feb 28, 2018 · Updated 8 years ago
hortonworks-gallery / ambari-freeipa-service
Ambari service for RedHat FreeIPA
☆11 · Sep 30, 2016 · Updated 9 years ago
tupol / spark-utils
Basic framework utilities to quickly start writing production ready Apache Spark applications
☆36 · Dec 15, 2024 · Updated last year
edwardcapriolo / hive_test
Unit test framework for hive and hive-service
☆65 · Jun 29, 2022 · Updated 4 years ago