godatadriven-dockerhub / hive-metastore
Hadoop/Hive/Spark container to perform CI tests
☆11 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for hive-metastore
- Scalable CDC Pattern Implemented using PySpark ☆18 · Updated 5 years ago
- ☆43 · Updated 3 months ago
- ☆13 · Updated last week
- Slowly Changing Dimension Type 2 in Hive query language, using an exclusive join technique with ORC Hive tables, partitioned and clustered… ☆16 · Updated 5 years ago
- ☆24 · Updated 2 months ago
- Connect DbVisualizer to Hortonworks HiveServer2 ☆9 · Updated 9 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆19 · Updated 4 years ago
- A sample project for KSQL along with Debezium and Kafka Connect ☆15 · Updated 2 years ago
- Presto cluster on top of Kubernetes ☆9 · Updated 3 years ago
- This is a basic Apache Pinot example for ingesting real-time MySQL change logs using Debezium ☆27 · Updated 3 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 · Updated last week
- Presto Trino with Apache Hive Postgres metastore ☆37 · Updated 2 months ago
- This repository contains recipes for Apache Pinot. ☆24 · Updated last month
- Sample processing code using Spark 2.1+ and Scala ☆51 · Updated 4 years ago
- ☆11 · Updated last year
- ☆47 · Updated 7 months ago
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆10 · Updated last year
- MinIO as local storage and DynamoDB as catalog ☆11 · Updated 5 months ago
- ☆27 · Updated 2 weeks ago
- Demos using Conduktor Gateway ☆16 · Updated 6 months ago
- Amundsen Gremlin ☆20 · Updated 2 years ago
- Code snippets used in demos recorded for the blog. ☆29 · Updated 3 weeks ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 · Updated 3 years ago
- Magic to help Spark pipelines upgrade ☆33 · Updated last month
- Set of tools for creating backups, compaction and restoration of Apache Kafka® Clusters ☆18 · Updated last week
- Pipeline library for StreamSets Data Collector and Transformer ☆32 · Updated last year
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 · Updated 7 months ago
- Spark on Kubernetes using Helm ☆34 · Updated 4 years ago