godatadriven-dockerhub / hive-metastore
Hadoop/Hive/Spark container to perform CI tests
☆11 Updated 4 years ago
Alternatives and similar repositories for hive-metastore
Users who are interested in hive-metastore are comparing it to the repositories listed below
- ☆14 Updated 3 weeks ago
- Connect DBVisualizer to Hortonworks HiveServer2 ☆9 Updated 10 years ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆20 Updated 5 years ago
- Yet Another (Spark) ETL Framework ☆21 Updated last year
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components ☆11 Updated 5 years ago
- Presto Trino with Apache Hive Postgres metastore ☆42 Updated 9 months ago
- This is a basic Apache Pinot example for ingesting real-time MySQL change logs using Debezium ☆27 Updated 4 years ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 Updated 9 years ago
- Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect ☆13 Updated 8 months ago
- Recipes of publicly-available Jupyter images ☆8 Updated 3 months ago
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:… ☆70 Updated 2 years ago
- Resources for trying out a nessie-flink-iceberg setup ☆11 Updated last year
- Slowly Changing Dimension type 2 using Hive query language using exclusive join technique with ORC Hive tables, partitioned and clustered… ☆16 Updated 6 years ago
- ☆30 Updated 2 weeks ago
- Pipeline library for StreamSets Data Collector and Transformer ☆33 Updated 2 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring ☆28 Updated 10 months ago
- Presto cluster on top of Kubernetes ☆9 Updated 3 years ago
- Delta Lake Examples ☆12 Updated 5 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 Updated last year
- ☆27 Updated 2 weeks ago
- CDC with NiFi, Kafka Connect, Flink SQL, Cloudera Data in Motion ☆12 Updated last year
- Sample processing code using Spark 2.1+ and Scala ☆51 Updated 4 years ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago
- Spark on Kubernetes using Helm ☆34 Updated 5 years ago
- Examples of Spark 3.0 ☆47 Updated 4 years ago
- ☆13 Updated last year
- Dione - a Spark and HDFS indexing library ☆52 Updated last year
- A sample project for KSQL along with Debezium and Kafka Connect ☆15 Updated 2 years ago
- MinIO as local storage and DynamoDB as catalog ☆15 Updated last year