kemonoske / spark-minio-delta-lakehouse-docker
A minimal Docker Compose setup for experimenting with cloud-agnostic lakehouse architectures: Apache Spark with Hive Metastore + Delta Lake + MinIO
☆14 Updated 7 months ago
Related projects
Alternatives and complementary repositories for spark-minio-delta-lakehouse-docker
- Query Iceberg in Trino, with Nessie as the catalog and MinIO in place of AWS S3 ☆11 Updated 6 months ago
- Apache Hive Metastore as a standalone server in Docker ☆67 Updated 3 months ago
- Docker environment to stream data from Kafka to Iceberg tables ☆24 Updated 8 months ago
- Presto Trino with Apache Hive Postgres metastore ☆37 Updated 2 months ago
- Example for the article "Running Spark 3 with standalone Hive Metastore 3.0" ☆96 Updated last year
- MinIO as local storage and DynamoDB as catalog ☆11 Updated 6 months ago
- Minimal example to run Trino, MinIO, and a standalone Hive Metastore on Docker ☆47 Updated 2 years ago
- A tool that makes it easy to run modular Trino environments locally. ☆33 Updated this week
- DuckDB for streaming data ☆70 Updated 7 months ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino and a Hive Metastore. Can be used for local testin… ☆57 Updated last year
- IceDB S3 Proxy to trick S3 clients into only seeing alive files ☆12 Updated 10 months ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 Updated last year
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆21 Updated 2 years ago
- The Data Product Descriptor Specification (DPDS) Repository ☆70 Updated last week
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka. ☆59 Updated last year
- Demonstration of using Materialize in the context of an e-commerce business to power real-time dashboards and features. ☆12 Updated 2 years ago
- Pythonic Iceberg REST Catalog ☆67 Updated 2 months ago
- Starburst Metabase driver ☆65 Updated 4 months ago
- Redis Kafka Connector (Source and Sink) by Redis ☆28 Updated last week