A minimal Docker Compose setup for experimenting with cloud-agnostic lakehouse architectures: Apache Spark with Hive Metastore + Delta Lake + MinIO
☆34 · Apr 17, 2024 · Updated 2 years ago
Alternatives and similar repositories for spark-minio-delta-lakehouse-docker
Users that are interested in spark-minio-delta-lakehouse-docker are comparing it to the libraries listed below.
- Simple project using pyflink, kafka and postgre containerized using Docker ☆11 · Aug 26, 2024 · Updated last year
- Query Iceberg in Trino, Nessie as Catalog, and use minio to replace AWS S3 ☆27 · Aug 7, 2025 · Updated 8 months ago
- trino + hive + minio with postgres in docker compose ☆27 · Aug 18, 2023 · Updated 2 years ago
- IceDB S3 Proxy to trick S3 clients into only seeing alive files ☆13 · Dec 24, 2023 · Updated 2 years ago
- Variable Selection Network with PyTorch ☆11 · May 29, 2024 · Updated last year
- ☆19 · Jun 12, 2025 · Updated 10 months ago
- Official Dockerfile for Delta Lake ☆62 · Feb 24, 2026 · Updated 2 months ago
- Real-world AI engineering dataset creation, SFT fine-tuning, and GRPO alignment ETL pipeline. ☆33 · Aug 27, 2025 · Updated 8 months ago
- Google Cloud Platform solution that provides an event driven process that flattens (unnests) Google Analytics 360 data that has been expo… ☆16 · Apr 13, 2026 · Updated 2 weeks ago
- O'Neil et al.'s Star Schema Benchmark: curated code ☆20 · May 19, 2025 · Updated 11 months ago
- End to End RAG LLM AI Assistant using LangChain, Llama3, Gemma2, OpenAI, FlaskAPI, Grafana ☆11 · Nov 24, 2025 · Updated 5 months ago
- ☆13 · Updated this week
- An Elite: Dangerous Market Connector (EDMC) plug-in to track minor faction activity in the game Elite: Dangerous. ☆13 · Apr 25, 2026 · Updated last week
- Useful generic types for Go ☆25 · Apr 25, 2026 · Updated last week
- ☆16 · Jul 25, 2025 · Updated 9 months ago
- Apache arrow examples in golang ☆15 · Apr 27, 2021 · Updated 5 years ago
- Hadoop-Hive-Spark cluster + Jupyter on Docker ☆86 · Jan 2, 2025 · Updated last year
- High Performance Go Driver for Bytehouse ☆14 · Jun 11, 2025 · Updated 10 months ago
- Scheduler of events for near real-time systems ☆31 · Aug 21, 2025 · Updated 8 months ago
- Example repo for web scraping with Sveltekit API routes, Puppeteer, and Vercel Blob Storage ☆12 · May 7, 2024 · Updated last year
- dbt + Trino demo project, using TPC-H sample data ☆19 · Mar 27, 2024 · Updated 2 years ago
- TAU Vehicle Type Recognition Competition ☆19 · Dec 18, 2019 · Updated 6 years ago
- universal-datalakehouse-postgres-ingestion-deltastreamer ☆11 · Apr 7, 2024 · Updated 2 years ago
- Fine-tuning a quantized Llama 2 chat model on Q&A pairs from counselchat.com to provide empathetic and appropriate mental health advice ☆14 · Oct 17, 2023 · Updated 2 years ago
- Crawlyx is an open-source command-line interface (CLI) based web crawler built using Node.js. It is designed to crawl websites and extrac… ☆13 · Apr 12, 2025 · Updated last year
- Purple CMS - Purple is Awesome ☆18 · Jan 27, 2024 · Updated 2 years ago
- This repository contains examples for my article published on Medium ☆11 · Oct 29, 2017 · Updated 8 years ago
- A Golang DuckDB library that doesn't require CGO ☆20 · Jan 24, 2025 · Updated last year
- ☆19 · Jul 8, 2024 · Updated last year
- a pytest plugin for dbt adapter test suites ☆19 · Oct 31, 2023 · Updated 2 years ago
- ☆27 · Mar 22, 2024 · Updated 2 years ago
- Example of project using Databricks Asset Bundle ☆45 · Aug 6, 2024 · Updated last year
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin… ☆78 · Sep 2, 2023 · Updated 2 years ago
- Create Greenplum docker files ☆11 · Aug 8, 2023 · Updated 2 years ago
- MLOps Implementation for Disaster Tweets Classifier Application ☆24 · Mar 24, 2024 · Updated 2 years ago
- Delta-Lake, ETL, Spark, Airflow ☆49 · Oct 9, 2022 · Updated 3 years ago
- Learn TypeScript through a series of refactorings to existing JavaScript code. ☆13 · Oct 19, 2018 · Updated 7 years ago
- ☆23 · Mar 20, 2024 · Updated 2 years ago
- Implement D*Lite and A* Algorithm on Processing environment ☆11 · Apr 7, 2017 · Updated 9 years ago