Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 · Oct 11, 2021 · Updated 4 years ago
Alternatives and similar repositories for shunting-yard
Users that are interested in shunting-yard are comparing it to the libraries listed below.
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆93 · Mar 5, 2024 · Updated 2 years ago
- Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments. ☆285 · Feb 24, 2026 · Updated last month
- Mutation testing framework and code coverage for Hive SQL ☆24 · May 11, 2021 · Updated 4 years ago
- JUnit integration for testing the Apache Hive Metastore and HiveServer2 Thrift APIs ☆26 · Jul 22, 2025 · Updated 8 months ago
- Tool for visualizing Apache Oozie pipelines ☆12 · Feb 15, 2016 · Updated 10 years ago
- A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service ☆13 · Mar 17, 2026 · Updated last week
- An ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables ☆15 · May 5, 2020 · Updated 5 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake ☆37 · Apr 3, 2024 · Updated last year
- Kafka Connector for Iceberg tables ☆16 · Jul 24, 2023 · Updated 2 years ago
- A unit testing framework for the Cascading data processing platform. ☆25 · Aug 25, 2021 · Updated 4 years ago
- kafka-cdc-redshift ☆13 · Jul 2, 2024 · Updated last year
- Generate mock data based on an Apache Avro schema and specific cardinality settings ☆10 · Apr 16, 2018 · Updated 7 years ago
- Presto's Elasticsearch connector ☆11 · Dec 7, 2016 · Updated 9 years ago
- A Trino connector to access git repository contents ☆18 · Feb 9, 2026 · Updated last month
- A Trino ODBC driver ☆14 · Jan 10, 2024 · Updated 2 years ago
- "hms-mirror" is a utility used to bridge the gap between two clusters and migrate hive metadata. ☆18 · Nov 8, 2025 · Updated 4 months ago
- ☆21 · Mar 21, 2025 · Updated last year
- Simple functional examples of running Hadoop + Hive in Docker with Docker Compose ☆25 · Dec 25, 2022 · Updated 3 years ago
- ☆20 · Sep 25, 2023 · Updated 2 years ago
- Proxy for S3 ☆18 · Feb 13, 2026 · Updated last month
- Codec for Hadoop adding OpenPGP encryption using Bouncy Castle ☆17 · Aug 18, 2011 · Updated 14 years ago
- Cloud based Data Platform based on Apache Spark ☆27 · Feb 17, 2026 · Updated last month
- An Open Source unit test framework for Hive queries based on JUnit 4 and 5 ☆262 · Jan 6, 2025 · Updated last year
- Schedule a data pipeline in Google Cloud using cloud function, BigQuery, cloud storage, cloud scheduler, stack trace, cloud build, and p… ☆26 · Jun 4, 2019 · Updated 6 years ago
- Generate big TPC-DS datasets with Databricks ☆21 · Jan 3, 2022 · Updated 4 years ago
- A re-implementation of Hadoop DistCP in Apache Spark ☆47 · Dec 20, 2023 · Updated 2 years ago
- Transporter for integrating OpenLineage with OpenMetadata ☆17 · Sep 10, 2025 · Updated 6 months ago
- Hadoop output committers for S3 ☆114 · Jul 9, 2020 · Updated 5 years ago
- Stream Discovery and Stream Orchestration ☆122 · Jan 7, 2026 · Updated 2 months ago
- Client swagger for NiFi with security ☆38 · May 20, 2022 · Updated 3 years ago
- A user friendly API for checking for and reporting on Avro schema incompatibilities. ☆59 · Mar 5, 2024 · Updated 2 years ago
- Best practices and recommendations for getting started with Amazon EMR on EKS. ☆68 · Jan 27, 2026 · Updated last month
- Implementing an end-to-end tweets ETL/analysis pipeline. ☆59 · Dec 8, 2022 · Updated 3 years ago
- Path Finding Course Material ☆15 · Aug 6, 2024 · Updated last year
- ☆33 · Oct 15, 2025 · Updated 5 months ago
- Spring Boot Demo application deployed to Amazon AWS ☆16 · Feb 26, 2015 · Updated 11 years ago
- Unit test framework for Hive and hive-service ☆63 · Jun 29, 2022 · Updated 3 years ago
- ACID Data Source for Apache Spark based on Hive ACID ☆96 · Jul 7, 2021 · Updated 4 years ago
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌 ☆29 · May 15, 2020 · Updated 5 years ago