A minimal Docker Compose setup for experimenting with cloud-agnostic lakehouse architectures: Apache Spark with Hive Metastore + Delta Lake + MinIO
☆34 · Apr 17, 2024 · Updated last year
Alternatives and similar repositories for spark-minio-delta-lakehouse-docker
Users that are interested in spark-minio-delta-lakehouse-docker are comparing it to the libraries listed below.
- ☆22 · Feb 5, 2024 · Updated 2 years ago
- Query Iceberg in Trino with Nessie as the catalog and MinIO in place of AWS S3 ☆26 · Aug 7, 2025 · Updated 7 months ago
- IceDB S3 proxy that tricks S3 clients into seeing only alive files ☆13 · Dec 24, 2023 · Updated 2 years ago
- Variable Selection Network with PyTorch ☆11 · May 29, 2024 · Updated last year
- ☆19 · Jun 12, 2025 · Updated 9 months ago
- Official Dockerfile for Delta Lake ☆61 · Feb 24, 2026 · Updated 3 weeks ago
- Real-world AI engineering dataset creation, SFT fine-tuning, and GRPO alignment ETL pipeline. ☆33 · Aug 27, 2025 · Updated 6 months ago
- Python interface to the OpenVidu WebRTC videoconference library. ☆11 · Feb 2, 2026 · Updated last month
- Google Cloud Platform solution that provides an event driven process that flattens (unnests) Google Analytics 360 data that has been expo… ☆16 · Sep 9, 2021 · Updated 4 years ago
- O'Neil et al.'s Star Schema Benchmark: curated code ☆20 · May 19, 2025 · Updated 10 months ago
- End to End RAG LLM AI Assistant using LangChain, Llama3, Gemma2, OpenAI, FlaskAPI, Grafana ☆11 · Nov 24, 2025 · Updated 3 months ago
- Useful generic types for Go ☆24 · Updated this week
- ☆16 · Jul 25, 2025 · Updated 7 months ago
- Apache Arrow examples in Go ☆15 · Apr 27, 2021 · Updated 4 years ago
- ☆39 · Jan 13, 2026 · Updated 2 months ago
- Documentation of Hologres ☆13 · Aug 18, 2020 · Updated 5 years ago
- ☆16 · Mar 9, 2026 · Updated 2 weeks ago
- Adds accents to Vietnamese text. N-Grams + Beam search, LSTM, Transformer, Evolved Transformer ☆18 · Feb 3, 2021 · Updated 5 years ago
- Hadoop-Hive-Spark cluster + Jupyter on Docker ☆86 · Jan 2, 2025 · Updated last year
- Scheduler of events for near real-time systems ☆31 · Aug 21, 2025 · Updated 7 months ago
- dbt + Trino demo project, using TPC-H sample data ☆19 · Mar 27, 2024 · Updated last year
- Benchmark ☆17 · Jul 3, 2024 · Updated last year
- Toy Hadoop cluster combining various SQL-on-Hadoop variants ☆13 · Nov 16, 2017 · Updated 8 years ago
- A Golang DuckDB library that doesn't require CGO ☆20 · Jan 24, 2025 · Updated last year
- ☆19 · Jul 8, 2024 · Updated last year
- Official code of the 1st-place solution for the NVIDIA AI City Challenge 2023 - Track 2 ☆19 · Jul 25, 2023 · Updated 2 years ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin… ☆75 · Sep 2, 2023 · Updated 2 years ago
- Instant access to the Spark cluster from anywhere ☆16 · Nov 10, 2020 · Updated 5 years ago
- A sample for installing Kubernetes on bare-metal production servers (Ubuntu distro) ☆10 · Jan 9, 2021 · Updated 5 years ago
- Delta-Lake, ETL, Spark, Airflow ☆48 · Oct 9, 2022 · Updated 3 years ago
- Training and evaluating phase ☆13 · Apr 1, 2021 · Updated 4 years ago
- Building a highly scalable Machine Learning System ☆27 · Dec 3, 2024 · Updated last year
- Flink, Presto, Trino TPC-DS benchmark ☆15 · Feb 20, 2023 · Updated 3 years ago
- Implementation of the D* Lite and A* algorithms in the Processing environment ☆11 · Apr 7, 2017 · Updated 8 years ago
- Modern games store web application built with React and Spring ☆11 · Dec 15, 2023 · Updated 2 years ago
- Arduino sketch for SJCAM action cameras, running on an ESP8266 ESP-01 ☆13 · Aug 25, 2016 · Updated 9 years ago
- ☆41 · Jul 4, 2022 · Updated 3 years ago
- Importing AdventureWorks (SQL Server Sample Database) to Neo4j ☆15 · Jun 17, 2025 · Updated 9 months ago
- Cloud-hosted Database Performance Data ☆18 · Updated this week